Motivated by the observation that the above approaches are primarily linear and that
biological systems, which process information in a highly nonlinear, collective, and
frequently iterative manner, are very adept at carrying out recognition, classification,
association, and optimization tasks, we elected to investigate the capabilities of collective
nonlinear processing in target identification. This report describes our research findings in
this  area.  Our  approach  was  also  influenced  by  the  observation  that  biological  pattern 
recognition  systems,  e.g.  in  the  cortex,  did  not  develop  in  isolation  but  in  synergism  with 
sensory  organs  and  their  feature  forming  networks.  This  means  that  development  of 
artificial  pattern  and  target  recognition  systems  may  benefit  from  considering  the  data 
acquisition, representation, identification, and cognition aspects of the problem
simultaneously. This unified approach to the problem of neuromorphic automated target
recognition (ATR) has produced, as described in this report, the following findings: (a)
The  differential  range-profile  of  an  isolated  target  (e.g.  aerospace  targets)  provides  an 
excellent feature vector for use with adaptive learning networks. (b) Near-perfect and
robust classification of test targets is demonstrated in a multilayer error backpropagation
network using realistic range-profile data generated in our anechoic chamber microwave
scattering measurement facility. (c) Despite this excellent performance, such networks lack
cognitive ability. This means that when the network is presented with a feature vector that does
not belong to any of the targets used in training it, it can classify it as one of the targets
it learned. The network has no inherent ability to tell on its own (i.e. without the help of
auxiliary gear acting as a novelty detector) that the test feature vector belongs to a novel
target. Lack of cognition is a serious limitation of networks meant to operate in complex
uncontrolled environments. (d) Most neural networks for pattern recognition being dealt
with  today  lack  cognitive  ability.  Preliminary  findings  of  our  research  motivate  us  to  make 
the  following  hypothesis  in  this  regard:  To  be  cognitive,  a  neural  network  must  be 
nonlinear  and  dynamical  and  able  to  manifest  bifurcation.  This  means  it  should  be  able  to 
carry  out  computations  with  more  than  one  type  of  attractor  in  its  phase-space  and  to  be 
able  to  switch  between  them  depending  on  whether  the  sensory  input  is  familiar  or  novel. 
Our  future  research  will  be  aimed  at  validating  this  hypothesis.  Demonstrating  its 
practicality in an engineering sense can have far-reaching implications. For example, it
could enable a combat aircraft not only to detect another aircraft with its radar at 200 nautical miles,
as is common today, but also to identify it cognitively without having to form an image.

1. STATEMENT OF THE PROBLEM STUDIED


Radar  targets  and  other  microwave  scattering  objects  can  be  identified  by 
either  forming  images  with  sufficient  resolution  to  be  recognized  by  the  human 
observer  or  by  forming  representations  (signatures  or  feature  vectors)  of  the  target 
and  using  them  in  automated  machine  recognition.  Tomographic  Microwave 
diversity  imaging  techniques  that  combine  angular  (aspect),  spectral  (wavelength), 
and  polarization  degrees  of  freedom  can  produce  images  of  the  3-D  distribution  of 
scattering  centers  of  a  target  with  near  optical  resolution.  Despite  this  capability 
there  are  practical  circumstances  when  the  size  and/or  cost  of  the  physical  aperture 
needed  to  furnish  the  required  angular  degrees  of  freedom  is  too  high,  or  when  the 
time  delay  involved  in  synthesizing  such  an  aperture  through  relative  motion 
between  the  radar  system  and  the  object  being  imaged  (as  for  example,  in  SAR  and 
ISAR)  is  not  acceptable.  One  is  faced  then  with  the  problem  of  having  to  identify  the 
target  from  a  limited  amount  of  information  that  is  insufficient  to  produce  an 
identifiable  image. 

A  number  of  approaches  have  been  studied  and  explored  in  the  past  to 
circumvent this problem [1]-[4]. Generally these have met with limited success.
They  include  super-resolution  by  analytic  continuation  and  singularity  expansion 
methods.  The  reason  for  the  limited  success  of  these  approaches  is  that  they  are 
primarily  linear. 

Humans,  and  other  animals,  recognize  objects  in  their  environment  with 
great  ease.  This  is  essential  for  their  survival.  They  do  this  also  with  robustness,  i.e. 
even  when  objects  are  partially  obscured  or  when  the  data  they  convey  to  sensory 
organs  are  corrupted  by  noise  and  the  signal  levels  involved  vary  over  very  wide 
dynamic range. Moreover, the recognition task is easily achieved even when the
object is not isolated but exists in the presence of clutter (background). These
functional capabilities are attributed to the collective, nonlinear nature of signal
processing in the central nervous system. Biological neural nets and their models
accordingly furnish an intriguing paradigm that is worth emulating in artificial
man-made systems. Such systems can be of great utility in pattern recognition,
solution  of  optimization  problems  and  inverse  scattering  problems  and  in 
associative  storage  and  recall  of  information  (associative  memory). 

The goal of the research described in this report is the study of the neural approach to
signal processing and assessment of its utility in target recognition and image
understanding. In particular, robust target recognition from sketchy (partial and/or
noisy)  information  is  of  primary  interest.  The  approach  adopted  in  our 
investigation  is  to  study  several  interrelated  facets  of  the  problem.  These  include: 
(a) microwave data acquisition and image understanding; (b) data representation,
which involves formation of signature vectors or feature vectors that can help
achieve robust distortion-invariant recognition. By distortion invariance we mean
recognizing the target irrespective of aspect, distance, or location within the field of
view (rotation, size, and shift invariance in the pattern recognition literature). By
robust we mean recognition from sketchy information over a wide dynamic range
of signal levels and interrogating feature vectors; (c) assessment and demonstration of the
capabilities of neural computation in the solution of selected inverse scattering
problems (image reconstruction and object recognition); (d) study of analog
hardware implementation of neural networks and learning machines employing
photonic (optoelectronic, electron-optical, electro-optical) technology.

Extensive  efforts  in  data  acquisition  and  microwave  image  understanding 
and  image  reconstruction  employing  diversity  information  and  range-profile 
representations  (see  Appendices  I  to  V)  were  carried  out  to  evaluate  and  establish 
the  viability  of  range-profiles  as  signature  or  feature  vectors  suitable  for  use  not  only 
in  microwave  image  reconstruction,  but  also  as  will  be  seen  below,  in  automated 
neuromorphic  target  classification  and  cognition. 

We  elected  to  study  neuromorphic  radar  target  recognition  of  aerospace 
targets  because  such  targets  are  isolated  and  clutter  is  minimal.  This  makes  the 
problem  less  difficult  than,  for  example,  object  recognition  by  the  visual  system  in 
natural  scenes  where  isolating  the  object  from  background  comprises  a  complex  task 
apparently  carried  out  by  the  eye-brain  system  routinely  through  a  mechanism  of 
attention focusing whose exact details are not fully known. We believe progress
with the aerospace target recognition problem can be helpful in the problem of 3-D
object recognition in natural scenes. Another reason for our choice of the radar target
recognition  of  aerospace  targets  is  the  ability  to  generate  realistic  scattering  data  and 
signatures  of  scale-models  of  targets  of  interest  in  our  anechoic  chamber  microwave 
scattering  facility.  The  facility  provides  semi-automated  measurement  of  the 
frequency response of test objects over any frequency (spectral) window in the
(2-26.5) GHz frequency range for any target aspect and any desired state of polarization
of the transmitter and the receiver. (See Appendix IV for detail.) The range-profile
representation alluded to earlier is the real part of the Fourier transform of the
measured frequency response of the target after removal of the range-phase. A
target is characterized by either its frequency response (measured frequency response
corrected for range-phase due to propagation between the phase center of the
transmitting/receiving antenna and the scattering phase-center of the target) for all
aspect angles of interest or by the corresponding range-profiles. In our work we refer
to the range-profiles variously as echo, signature vector, or feature vector. When
sufficiently wide spectral windows are used in data acquisition, the echo or
range-profile from the target is an approximation of the impulse response of the
target produced by impulsive plane wave illumination.
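
As an illustration of the range-profile definition above, the following Python sketch forms a range-profile from a simulated frequency response. It is only a sketch under stated assumptions: the target is a toy set of ideal point scatterers, the function names (`simulate_response`, `remove_range_phase`, `range_profile`) are hypothetical, and the 101-sample, 6 to 17 GHz window is borrowed from the network description later in this report rather than from the measurement procedure of Appendix IV.

```python
import cmath
import math

C = 299_792_458.0  # speed of light, m/s

def simulate_response(freqs, scatterers, r0):
    """Measured response of ideal point scatterers; each is (amplitude, range offset in m)."""
    return [sum(a * cmath.exp(-4j * math.pi * f * (r0 + r) / C) for a, r in scatterers)
            for f in freqs]

def remove_range_phase(freqs, response, r0):
    """Undo the two-way propagation phase to the target phase center at range r0."""
    return [h * cmath.exp(4j * math.pi * f * r0 / C) for f, h in zip(freqs, response)]

def range_profile(freqs, corrected, ranges):
    """Real part of the Fourier transform of the corrected response at each relative range."""
    n = len(freqs)
    return [sum(h * cmath.exp(4j * math.pi * f * r / C) for f, h in zip(freqs, corrected)).real / n
            for r in ranges]

freqs = [6e9 + k * 0.11e9 for k in range(101)]   # assumed window: 101 samples, 6 to 17 GHz
scatterers = [(1.0, -0.10), (0.5, 0.15)]         # toy target, not measured data
ranges = [-0.3 + 0.005 * i for i in range(121)]  # relative range grid, m

def profile_at(r0):
    measured = simulate_response(freqs, scatterers, r0)
    return range_profile(freqs, remove_range_phase(freqs, measured, r0), ranges)

near, far = profile_at(10.0), profile_at(20.0)
peak_range = ranges[max(range(len(near)), key=near.__getitem__)]
```

In this toy case the strongest peak falls at the range offset of the stronger scatterer, and the profiles computed for the two target ranges coincide once the range-phase is removed, which is the range independence relied upon later in the report.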


2.  SUMMARY  OF  THE  MOST  IMPORTANT  RESULTS 


Our  initial  efforts  in  assessing  the  capabilities  of  neurocomputing  in 
microwave  scatterer  identification  made  use  of  a  fully  connected  neural  network 
operating  as  heteroassociative  memory.  (See  Appendix  IV.)  The  connection 
weights between neurons in the network were computed off-line and set in the
network.  The  network  consisting  of  32x32  binary  neurons  was  implemented  in 
software.  The  network  was  formed  from  sinogram  representations  of  three  test 
targets:  scale  models  of  a  B-52,  AWAC,  and  Space  Shuttle.  The  sinogram 
representation  is  basically  a  binarized  cartesian  plot  of  range-profiles  versus  aspect 
angle  for  a  fixed  elevation  angle  of  the  target.  When  tested  with  partial  versions  of  a 
sinogram,  the  network  was  able  to  classify  the  target  to  which  the  data  belonged 
correctly.  Partial  data,  down  to  a  fraction  of  nearly  10%  of  the  full  sinogram 
representation  was  found  able  to  produce  correct  classification  of  the  three  targets. 
This network demonstrated clearly the distinctive features of neural processing, i.e.
collective and nonlinear signal processing, as compared to conventional signal
processing: the functions of data storage, processing, and object labeling are
performed by the same elements of the network. This is unlike conventional signal
processing where these functions are normally carried out by separate elements of
the system. This means that, when the network is implemented in hardware, the
three functions listed above would be carried out by the same hardware. The
network  required  in  its  operation  that  the  aspect  angles  at  which  the  test  data  were 
collected  be  known.  Although  it  is  possible  to  obtain  this  information,  it  dictates  in 
practice  the  use  of  auxiliary  tracking  radars  and  additional  signal  preprocessing  to 
determine  the  target  orientation  relative  to  the  radar  line-of-sight  at  which  the 
range-profile data comprising the test data were acquired. This complication can be
avoided if one can design a network capable of classifying a target from a few echoes or
from  a  single  echo  or  "look"  (single  range-profile)  without  having  to  specify  at  what 
aspect  angle  of  the  target  the  echo  occurs.  This  capability  is  highly  desirable  and  is 
important  from  a  practical  viewpoint. 
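
The recall-from-partial-data behavior described above can be illustrated with a small heteroassociative memory. The sketch below is an assumption-laden stand-in, not the network of Appendix IV: it uses the textbook outer-product (Hebbian) prescription for off-line weights, random bipolar patterns in place of binarized sinograms, one-of-three bipolar label codes, and hypothetical function names.

```python
import random

def train_heteroassociative(patterns, labels):
    """Off-line outer-product weights mapping each stored pattern to its label."""
    n_out, n_in = len(labels[0]), len(patterns[0])
    weights = [[0] * n_in for _ in range(n_out)]
    for pattern, label in zip(patterns, labels):
        for j in range(n_out):
            for i in range(n_in):
                weights[j][i] += label[j] * pattern[i]
    return weights

def recall(weights, probe):
    """Threshold the weighted sums to obtain a bipolar label."""
    return [1 if sum(w * x for w, x in zip(row, probe)) >= 0 else -1 for row in weights]

def keep_fraction(pattern, fraction, rng):
    """Keep a random fraction of the entries and zero the rest (missing data)."""
    keep = set(rng.sample(range(len(pattern)), int(fraction * len(pattern))))
    return [v if i in keep else 0 for i, v in enumerate(pattern)]

rng = random.Random(0)
size = 32 * 32                                   # same element count as the 32x32 array
patterns = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(3)]
labels = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]  # hypothetical label codes
weights = train_heteroassociative(patterns, labels)
recalled = [recall(weights, keep_fraction(p, 0.10, rng)) for p in patterns]
```

With only about 10% of each stored pattern supplied, the overlap with the correct pattern still dominates the cross-talk from the other two, so each probe recalls its own label. Real sinograms are correlated, so the margin seen here with random patterns should not be read as a prediction for measured data.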

To  investigate  the  feasibility  of  robust  radar  target  recognition  from  a  single 
look we examined next the performance of a multilayered feedforward error
back-propagation network. The network we set up was an outcome of an investigation we
carried out of a learning network for extrapolation and target identification. This
network (see Appendix VI) consisted of three layers: an input layer of 101 complex
neurons representing the complex frequency response of the target, a hidden layer of
101 real neurons, and an output layer of N=2 binary neurons capable of presenting
$2^N = 4$ target labels, which can classify up to four targets. The connection weights
between  the  input  layer  and  the  hidden  layer  are  those  of  the  discrete  Fourier 
transform  kernel  and  are  fixed.  The  connection  weights  between  the  hidden  layer 
and the output layer are adaptive and are determined by an error-driven supervised
learning  algorithm.  The  network  was  trained  on  the  (6-17)GHz  frequency  response 
data  of  three  test  objects:  scale  models  of  a  B-52,  a  Boeing  707  and  a  space  shuttle.  We 
found  this  net  can  learn  the  frequency  responses,  or  corresponding  range-profiles,  of 
the  three  test  targets  constituting  the  training  set.  Following  training,  the  net  is  able 
to  classify  any  one  of  the  frequency  responses  of  an  object  presented  to  it  by 
associating  it  with  the  correct  object  label  formed  by  neurons  of  the  output  layer. 
When  a  two-out-of-three  majority  vote  was  adopted  in  keeping  score  of  the 
network's performance as frequency response echoes were presented to it, the score
was found to be perfect even when only 35% of the training set of each target was
employed in training the network. This constitutes good generalization and means
that a network need not be trained with very large numbers of feature vectors before
it can capture the underlying structure of the target. It is worth noting that, unlike
the preceding network, this network does not require aspect information to
operate. In addition, it was found that the network has excellent robustness, in
that the excellent performance cited above can be maintained even at very low
signal-to-noise  ratio  and  over  very  wide  dynamic  range  of  the  frequency  response 
data.  (See  Appendix  VI  for  detail.)  The  network  achieves,  despite  its  simplicity  (it  is 
essentially  a  one  layer  perceptron  network),  our  stated  goal  namely  that  of  robust 
distortion  invariant  classification  of  training  targets. 
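
The architecture just described (a fixed Fourier-transform first layer, an adaptive output layer of two binary neurons, and a two-out-of-three vote) can be sketched as follows. Everything specific here is an assumption made for illustration: the toy echoes are single point scatterers with random strength, the target names and bin positions are invented, the hidden layer is normalized to unit peak, and the output weights are trained with a margin perceptron rule standing in for the error-driven algorithm of Appendix VI.

```python
import cmath
import math
import random
from collections import Counter

N = 101  # spectral samples per echo, as in the text
LABELS = {"target_a": (-1, -1), "target_b": (-1, 1), "target_c": (1, -1)}
BINS = {"target_a": 12, "target_b": 40, "target_c": 77}  # toy: one dominant range bin each

# Fixed first layer: inverse-DFT kernel, computed once and never trained.
KERNEL = [[cmath.exp(2j * math.pi * k * m / N) / N for k in range(N)] for m in range(N)]

def toy_echo(name, rng):
    """Noisy frequency response of one point scatterer with random strength."""
    amp, m = rng.uniform(0.5, 2.0), BINS[name]
    return [amp * cmath.exp(-2j * math.pi * k * m / N)
            + complex(rng.gauss(0, 0.005), rng.gauss(0, 0.005)) for k in range(N)]

def hidden(echo):
    """Real part of the transform (a range-profile), normalized to unit peak."""
    h = [sum(w * x for w, x in zip(row, echo)).real for row in KERNEL]
    peak = max(abs(v) for v in h)
    return [v / peak for v in h]

def train(samples, lr=0.1, margin=0.2, max_epochs=100):
    """Margin perceptron on the hidden-to-output weights only."""
    w, b = [[0.0] * N for _ in range(2)], [0.0, 0.0]
    for _ in range(max_epochs):
        errors = 0
        for h, target in samples:
            for j in range(2):
                act = sum(wi * hi for wi, hi in zip(w[j], h)) + b[j]
                if target[j] * act <= margin:
                    errors += 1
                    w[j] = [wi + lr * target[j] * hi for wi, hi in zip(w[j], h)]
                    b[j] += lr * target[j]
        if errors == 0:
            break
    return w, b

def classify(w, b, echo):
    h = hidden(echo)
    code = tuple(1 if sum(wi * hi for wi, hi in zip(w[j], h)) + b[j] > 0 else -1
                 for j in range(2))
    return next((name for name, label in LABELS.items() if label == code), None)

def majority_vote(w, b, echoes):
    """Two-out-of-three vote over single-echo decisions."""
    name, votes = Counter(classify(w, b, e) for e in echoes).most_common(1)[0]
    return name if votes >= 2 else None

rng = random.Random(0)
training = [(hidden(toy_echo(n, rng)), LABELS[n]) for n in LABELS for _ in range(8)]
weights, biases = train(training)
results = {n: majority_vote(weights, biases, [toy_echo(n, rng) for _ in range(3)])
           for n in LABELS}
```

Note that `classify` always returns one of the learned labels or the unused fourth code; nothing in this structure signals that an echo came from an unfamiliar target, which is the limitation taken up in the next paragraph.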

At  this  stage  of  our  research  we  thought  that  we  had  realized  the  task  we  set 
out to achieve. We were quickly disappointed. Despite the excellent performance
capabilities cited above, neither this network nor the network described before is
of any practical use. Both networks lack cognitive ability. When presented with
novel data from a target the net has not been trained upon, i.e. has not seen before,
it could, because of the lack of cognition, classify it erroneously as one of the targets
it has learned. This lack of cognition is a serious problem facing practical
applications of neural networks that need to operate in complex uncontrolled
environments.  This  point  requires  some  clarification.  One  can  train  a  network  for 
example  to  recognize  handwritten  zip  code  numbers.  The  trained  network  is  useful 
because  it  is  only  meant  to  recognize  zip  code  numbers.  No  one  is  going  to  use  it  to 
recognize  the  Japanese  alphabet  for  example.  It  is  designed  to  operate  in  a  controlled 
environment.  This  is  not  true  for  a  neural  net  designed  to  recognize  radar  targets 
because  the  environment  in  which  the  net  is  intended  to  operate  is  not  controlled. 
Targets  other  than  those  the  network  is  trained  to  recognize  can  occur  in  its 
environment.  There  are  two  possible  solutions  to  this  problem  that  come  to  mind. 
One  is  to  train  the  network  with  every  target  it  could  conceivably  encounter  in  its 
environment.  This  is  not  practical,  because  even  if  details  can  be  worked  out,  it 
would  result  in  very  large  networks  of  unacceptable  size.  The  second  solution,  and 
this  is  often  invoked  by  workers  in  the  field  when  they  realize  that  the  network  they 
developed  is  not  cognitive,  is  to  incorporate  a  "novelty  filter."  This  consists  of  using 
auxiliary gear that can measure other attributes of the target such as size, speed,
altitude,  etc.  and  use  these  attributes  to  decide  whether  the  target  encountered  is  of 
interest  or  not,  i.e.,  whether  the  output  of  the  neural  network  engaged  is  to  be  taken 
seriously  or  not.  The  disadvantage  of  this  approach  is  increased  complexity  and  cost 
of the data acquisition system. Biological neural networks possess inherent
cognitive abilities. There is no doubt that multisensory modalities, which can be
viewed as providing something akin to novelty filtering, are involved in networks
of the brain to reduce ambiguities. There must be, however, more to it than that.
We  have  started  recently  a  study  of  the  issue  and  are  finding  the  results  obtained  so 
far  most  intriguing.  We  believe  this  study  will  lead  to  ways  of  designing  a  new 
generation  of  neural  networks  with  inherent  cognitive  ability.  Before  discussing 
our  findings  in  this  regard  we  will  briefly  summarize  the  findings  of  our 
investigation  in  data  acquisition  and  representation. 

The range-profile of a target for a given aspect resembles the response of the target
to impulsive plane wave illumination at that aspect, provided the spectral window used in
data acquisition is sufficiently wide. A general criterion for selecting the spectral
width $\Delta f$ is $\delta = c/(2\Delta f)$, where $\delta$ is the desired range resolution on the target and $c$ is the
velocity of light. In general terms, $\delta$ corresponds to the size of the finest detail on
the target (and hence in its image) needed to distinguish it from other targets.
Because  the  form  of  the  echo  (temporal  impulse  response)  produced  by  an 
impulsive  plane  wave  sweeping  the  target  is  independent  of  range  to  the  target  (it 
only  depends  on  aspect),  range-profile  data  ensures,  when  used  as  feature  vector  in  a 
neural  based  radar  target  recognition  scheme,  that  performance  is  independent  of 
range.  Invariance  with  target  location  within  the  field  of  view  is  obtained  then  by 
aiming  the  T/R  antenna  of  the  acquiring  radar  at  the  target  at  all  times  by  precise 
tracking.  Invariance  with  aspect  is  achieved  then  by  training  a  suitable  neural 
network  with  the  normalized  range-profile  data  collected  over  the  solid  angle  of 
encounter of every target the network is required to learn. Angular sampling
considerations applied to a target of extent $L$ dictate that the number of
range-profiles needed to characterize the target is given by $N = \Omega/\delta\Omega$, where $\delta\Omega \approx (\lambda/L)^2$,
with $\lambda$ being the mean wavelength used in data acquisition and $\Omega$ the solid angle
of  encounter.  Values  of  N  for  typical  aerospace  targets  and  practical  spectral 
windows  can  therefore  be  quite  large.  The  generalization  ability  of  trainable  neural 
networks  discussed  earlier  means  that  the  network  need  not  be  trained  with  every 
one of its N range-profiles but only with a fraction of them, which we call the training
set.  This  helps  reduce  training  time.  The  training  set  can  be  selected  randomly  or 
uniformly  over  the  solid  angle  of  encounter. 
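
To give a feel for the magnitudes involved, the short Python sketch below evaluates the range-resolution criterion and the angular sampling estimate quoted above. The numerical inputs are assumptions chosen for illustration only: the 6 to 17 GHz window is the one used for network training earlier in this report, while the 1 m target extent and the 1 steradian solid angle of encounter are hypothetical.

```python
C = 299_792_458.0  # speed of light, m/s

def range_resolution(bandwidth_hz):
    """Range resolution: c divided by twice the spectral width."""
    return C / (2.0 * bandwidth_hz)

def profiles_needed(solid_angle_sr, mean_wavelength_m, extent_m):
    """Number of range-profiles: solid angle of encounter over (wavelength / extent) squared."""
    return solid_angle_sr / (mean_wavelength_m / extent_m) ** 2

delta = range_resolution(17e9 - 6e9)       # assumed 6 to 17 GHz window
mean_wavelength = C / 11.5e9               # wavelength at the window center
n_profiles = profiles_needed(1.0, mean_wavelength, 1.0)  # hypothetical: 1 sr, 1 m target
```

Under these assumptions the resolution is a little under 1.4 cm and the count is on the order of 1,500 range-profiles, which illustrates why training on a fraction of the profiles is attractive.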

The  ultimate  goal  of  our  data  acquisition  and  representation  effort  is  to  show 
that  range-profile  data  collected  in  a  controlled  anechoic  chamber  environment 
employing  scale  models  of  targets  of  interest  can,  by  paying  careful  attention  to 
scaling issues based on the principle of electromagnetic similitude [5], be used to
recognize the actual size targets by conventional broad-band coherent radar systems
in the field. When the conductivity $\sigma$ of scale models and actual targets is very high
($\sigma \to \infty$), electromagnetic similitude considerations show that scaling is satisfied
when data acquisition with a scale model that is $n$ times smaller than the actual
target is carried out over a frequency range that is $n$ times greater than the frequency
range of the actual radar used in the field.
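
The scaling rule can be checked with simple arithmetic, sketched below in Python. The field radar band and the scale factor are hypothetical values picked for illustration; only the (2-26.5) GHz facility range comes from this report.

```python
def chamber_band(field_band_ghz, scale):
    """Band to measure a 1/scale model so that it mimics the full-size target."""
    low, high = field_band_ghz
    return (low * scale, high * scale)

def fits_facility(band_ghz, facility_ghz=(2.0, 26.5)):
    """True if the scaled band lies inside the facility's frequency range."""
    return facility_ghz[0] <= band_ghz[0] and band_ghz[1] <= facility_ghz[1]

field_band = (0.2, 0.5)   # hypothetical field radar band, GHz
scale = 30                # hypothetical: model is 30 times smaller than the target
model_band = chamber_band(field_band, scale)
feasible = fits_facility(model_band)
too_small = fits_facility(chamber_band(field_band, 60))
```

With these assumed numbers the model must be measured over roughly 6 to 15 GHz, which the facility covers; doubling the scale factor would push the upper band edge beyond 26.5 GHz.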


We  return  now  to  describing  our  findings  so  far  regarding  introducing 
cognition  to  neural  networks.  The  majority  of  neural  networks  described  in  the 
literature,  compute  by  forming  point  attractors  in  phase-space  with  prescribed  basins 
of  attraction.  We  have  recently  described  an  error-driven  algorithm  for  forming 
string  attractors  in  phase-space  of  networks  with  synchronously  updated  neurons 
and  proposed  its  use  in  target  recognition,  [6]  (see  also  Appendix  VII).  A  string 
attractor  is  basically  a  point  attractor  with  filamentary,  rather  than  "volumetric", 
basin  of  attraction.  Synchronicity  and  its  role  in  feature  binding  is  receiving 
increased attention in the literature [7]-[9], but the question of how to achieve feature
binding in practice has thus far received little attention. The learning rule given in
[6]  for  forming  a  string  attractor,  i.e.  storing  a  sequence  of  vectors  in  a  network, 
applies  also  to  forming  a  periodic  attractor  by  closing  the  string  on  itself.  The 
characteristics of periodic and string attractors revealed so far in our work are: (a)
High storage capacity $M \geq N$, where M is the number of vectors stored in the sequence
and N is the number of neurons in the network. For example, M=40 bipolar binary
vectors  were  stored  in  a  network  of  N=32  bipolar  binary  neurons  in  less  than  50 
training cycles. (b) Arbitrary (i.e. highly correlated and nonorthogonal) vectors can
be stored in sequence. (c) Initiated from any member of the stored sequence, a string
attractor network cycles through all subsequent vectors and terminates on the last
stored vector in the string, while a periodic attractor network cycles repeatedly
through all vectors stored, which is equivalent to producing a periodic
spatio-temporal oscillation of states of the neuron population. (d) Highly isolated periodic
and string attractors are formed, with the degree of isolation controlled by the
threshold level of neurons. By this we mean that, for relatively high neuron thresholds,
initiators (initiating state vectors) with Hamming distance $d_H > 1$ from any of the
stored vectors do not trigger the periodic attractor but cause the network instead to
bifurcate and converge to a limit point which is usually a ground state or one close
to it. (e) Several nonintersecting periodic or string attractors may be stored in the
same network. (f) The learning rule in [6] for storing sequences scales well with
network size; for instance, networks with 32, 64 and 128 neurons were tested and all
showed similar behavior. (g) Sequences of arbitrary unipolar binary vectors can
also be stored provided the vectors are not too sparse.
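
The storage and recall behavior in items (b) and (c) can be illustrated with a small simulation. The sketch below is not the learning rule of [6], which is not reproduced in this report; it assumes a perceptron-style error-driven rule in which each neuron's weights are corrected whenever the synchronous update of a stored vector fails to produce its successor. The function names, the zero threshold, and the choice of 8 vectors in a 32-neuron network (small enough for this simple rule to converge reliably) are all assumptions. It does not attempt to reproduce the capacity in (a) or the threshold-controlled isolation in (d).

```python
import random

def step(weights, state, theta=0.0):
    """Synchronous update of all bipolar neurons with a common threshold theta."""
    return [1 if sum(w * s for w, s in zip(row, state)) - theta > 0 else -1
            for row in weights]

def store_sequence(sequence, periodic, theta=0.0, lr=1.0, max_epochs=500):
    """Error-driven storage of the transitions v_k -> v_(k+1)."""
    n = len(sequence[0])
    # Periodic attractor: last vector maps to the first. String: last maps to itself.
    successors = sequence[1:] + [sequence[0] if periodic else sequence[-1]]
    weights = [[0.0] * n for _ in range(n)]
    for _ in range(max_epochs):
        errors = 0
        for current, target in zip(sequence, successors):
            output = step(weights, current, theta)
            for i in range(n):
                if output[i] != target[i]:
                    errors += 1
                    weights[i] = [w + lr * target[i] * s
                                  for w, s in zip(weights[i], current)]
        if errors == 0:
            return weights
    raise RuntimeError("sequence not learned within max_epochs")

rng = random.Random(1)
sequence = [[rng.choice((-1, 1)) for _ in range(32)] for _ in range(8)]

# Periodic attractor: starting on the first vector, one full period returns to it.
w_periodic = store_sequence(sequence, periodic=True)
state, cycle = sequence[0], []
for _ in range(len(sequence)):
    state = step(w_periodic, state)
    cycle.append(state)

# String attractor: starting mid-sequence, the state ends on the last stored vector.
w_string = store_sequence(sequence, periodic=False)
state = sequence[3]
for _ in range(10):
    state = step(w_string, state)
final_state = state
```

The periodic network visits every stored vector in order and closes on itself, while the string network runs to the end of the string and stays there, matching the qualitative description in (c).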

Periodic  attractor  networks  with  the  above-listed  properties,  and  particularly 
(d),  offer  a  possible  mechanism  for  cognition  in  that  when  the  vectors  stored  are 
feature  vectors  representing  an  object  in  its  different  manifestations,  and  the 
initiating  vector  is  one  of  the  stored  feature  vectors,  or  is  close  to  any  one  of  them  in 
the Hamming sense (e.g. $d_H < 1$), the periodic attractor will be triggered. Now if a
label  vector,  identifying  the  object,  was  imbedded  earlier  on  in  the  periodic  attractor 
when  it  was  formed,  it  would  also  be  triggered  once  every  period  signaling  thereby 
that  the  input  is  one  of  the  feature  vectors  stored.  Because  of  the  high  degree  of 
isolation of a periodic attractor achieved by proper choice of neuron threshold, an
initiating input vector with Hamming distance $d_H > 1$ would not trigger the
sequence or the imbedded label. Instead, the network bifurcates and switches its
operation to computing with point attractors, whereby it proceeds to converge
rapidly to a fixed ground state, where all neurons are in their low binary state, or
one close to it. This serves as an indication within the network that the
input is not familiar, thus providing cognition.

Feature vectors of more than one object can be stored in separate
nonintersecting periodic attractors containing imbedded labels in the same network.
Attractors and labels are triggered in such a network only if the initiator is of
Hamming distance $d_H < 1$ from one of the vectors stored in an attractor. Novel
initiators will not trigger any of the labels, and this provides such a network with
the ability to finely distinguish whether certain feature vectors are present in its environment.

Despite their potential usefulness for feature binding, periodic attractor
networks are void of generalization because of their high isolation. A slight change
in a feature vector that triggers the attractor renders it ineffective, causing the
network to bifurcate. This means that a recognized object can stop being recognized
if its feature vectors change in the slightest. This suggests that periodic attractor
networks  need  to  be  used  with  additional  networks  that  can  furnish  the 
generalization  capabilities  needed  in  order  to  provide  the  composite  network  with 
cognition  and  robustness  at  the  same  time.  Presently,  we  are  seeking  methods  for 
imparting  prescribed  domains  of  attraction  for  each  vector  stored  in  the  periodic 
attractor.  This  would  provide  the  periodic  attractor  with  controlled  basin  of 
attraction.  Initial  results  suggest  that  this  can  be  achieved  by  combining  periodic 
attractor  networks  with  arrays  of  feedforward  feature  extracting  networks. 
Advantages of this hierarchical approach to network construction that we are noting at
this very preliminary stage are modularity and potential reduction of learning time
even  in  large  networks  because  of  segmentation.  All  this  appears  to  be  achieved 
while  enjoying  the  good  robustness  and  noise  immunity  of  feedforward  learning 
networks.  Although  such  feedforward  networks  provide  robustness  and  noise 
immunity,  they  lack  cognition.  Cognition  is  provided  by  the  periodic  attractor.  This 
approach  could  provide  us,  for  the  first  time,  with  a  way  for  combining  distinct 
neural networks or neural modules in such a way as to achieve higher level
processing  such  as  cognition. 

Finally  we  report  on  our  findings  in  the  area  of  photonic  or  optoelectronic 
implementation  of  neural  networks.  Interest  in  artificial  neural  networks 
implemented  in  analog  hardware  rather  than  digital  software  stems  primarily  from 
their  potential  speed  advantage.  The  photonic  approach  is  motivated  by  the  desire 
to  combine  the  best  attributes  of  optics,  namely  parallelism  and  massive 
interconnectivity,  with  the  best  attributes  of  electronics,  decision  making 
(nonlinearity) and gain. During the period of this report we designed, constructed,
and studied the performance of what we believe to be the first stochastic photonic
learning  machine  (see  Appendix  VIII  for  detail).  Learning  in  this  machine  is 
stochastic, taking place in a self-organizing tri-layered opto-electronic neural net with
plastic connectivity weights that are formed in a programmable nonvolatile spatial
light modulator. The net, which can also be called a Boltzmann Learning Machine,
learns by adapting its connectivity weights in accordance with environmental inputs.
Learning  is  driven  by  error  signals  derived  from  state-vector  correlation  matrices 
accumulated  at  the  end  of  fast  annealing  bursts  that  are  induced  by  controlled  optical 
injection  of  noise  into  the  network.  Operation  of  the  machine  is  made  possible  by 
two  important  developments  in  our  work:  Fast  annealing  by  optically  induced 
noisy  thresholding,  and  stochastic  learning  with  binary  weights.  Results  obtained 
with  a  24  neuron  prototype  partitioned  into  three  layers  with  8  input,  8  hidden,  and 
8  output  neurons  show  that  the  machine  can  learn,  with  a  score  of  about  95%,  to 
associate  two  8-bit  vector  pairs  in  10-60  minutes  with  relatively  slow  (60  msec 
response time) neurons. Shifting to neurons with 1 μsec response time, for example,
could reduce the learning time by roughly $10^4$ times. Slow neurons were
deliberately used to make it easier to visually examine and record the changing state
vector of the network, which is displayed with an array of LEDs, as it operates.
Increasing  the  number  of  hidden  neurons  in  this  machine  from  8  to  16  is  shown,  by 
numerical  simulations,  to  increase  the  learning  score  to  100%.  The  spatial  light 
modulator  (SLM)  used  in  constructing  the  machine  had  to  be  of  the  nonvolatile 
variety. The only such SLM available to us at the time (and still the only one) was the
magneto-optic SLM. A scheme for enhancing the frame rate of this SLM from
video rate to 1000 frames/sec to speed up learning was developed [10].
Unfortunately,  the  pixels  of  this  device  have  binary  (on-off)  transmission  only.  This 
restricted  the  connection  weights  of  the  neurons  in  the  machine  to  binary  values. 
All  adaptive  learning  algorithms  require  analog  weights.  To  overcome  this 
limitation  we  developed  a  scheme  for  Boltzmann  machine  learning  with  binary 
weights  (see  Appendix  VIII).  Although  effective  in  learning,  the  number  of 
associations  the  network  could  learn  with  the  binary  weights  scheme  is  less  than 
what  it  can  learn  with  analog  weights.  This  underlines  the  importance  of 
developing  programmable  nonvolatile  spatial  light  modulators  for  use  in  photonic 
learning  machines. 
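
For orientation, the description above (error signals derived from state-vector correlation matrices) matches the general shape of the textbook Boltzmann machine learning rule. The form below is that textbook rule, shown only as a reference point. It is not taken from Appendix VIII and does not describe the binary-weight scheme. The symbols $\eta$ (learning rate), $s_i$ (state of neuron $i$), and the "clamped" and "free" labels are assumptions introduced here.

```latex
% Textbook Boltzmann learning rule (assumed notation, not the report's own):
% the weight change follows the difference between state correlations measured
% with the environment clamped to the visible neurons and with the net running free.
\Delta w_{ij} \;=\; \eta \left( \langle s_i s_j \rangle_{\mathrm{clamped}}
                              - \langle s_i s_j \rangle_{\mathrm{free}} \right)
```

In this reading, each correlation term corresponds to one of the correlation matrices accumulated at the end of an annealing burst.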
Abstract: The microwave image of a metallic object is interpreted
from a new point of view, based on the understanding of the
interconnection between the scattering mechanisms, the data acquisition
system, and the image reconstruction algorithm. From this understanding
we are able to interpret and predict microwave images reconstructed
from data collected over specified spectral and angular windows. The
connection between a special scattering mechanism, edge diffraction, and its
reconstructed image is established. The microwave image of an edge
will be two bright points whose locations correspond to the end points
of the edge if the normal aspect angle is not included in the angular
windows; otherwise a line joining the two end points and representing
the edge will appear in the image. Experimental images of a trihedral
reflector reconstructed from data collected over different angular
windows support this new approach to image interpretation and
prediction.

I.  Introduction 

Microwave diversity imaging is an imaging technique that
exploits possible degrees of freedom, including spectral, angular, and
polarization diversities [1]. In this imaging system, an object is
seated on a rotating pedestal and is illuminated by a plane wave.
For each aspect angle a set of pulses at different frequencies is
transmitted and its echoes are received. The object is then rotated
and the measurement is repeated to obtain the multiaspect stepped
frequency response of the scattering object.

In the microwave regime, the physical optics (PO) approximation
is usually used to model the scattered field of a conducting
object. It was shown that a three-dimensional (3-D) Fourier transform
(FT) relationship exists between the shape of a perfectly conducting
object and its backscattered far field under the PO approximation
[2]. However, the PO approximation is inadequate for
scattering problems of a complex shaped conducting object. At
high frequencies, edge diffractions, multiple reflections, creeping
waves, and surface traveling waves are also important scattering
mechanisms [3]. A field scattered from these scattering mechanisms
cannot be treated by the PO approximation. Additionally,
the spectral and angular windows for the data are usually restricted
by practical constraints. Therefore, the microwave image of a
metallic object might be different from its geometrical shape.

In this paper we will investigate microwave images of metallic
objects employing microwave diversity imaging from a new point
of view, based on the understanding of the interconnection between
the object scattering mechanisms, the data acquisition system, and
the image reconstruction algorithm utilized in image retrieval. The
image reconstruction algorithm can be either the Fourier transform
method or the back-projection method, and these two methods yield
equivalent results [1], [4]. However, the back-projection method
provides more physical insight into the image formation process
[5]. Basically, the image is formed in three steps: 1) measure the
scattered field over a specified spectral window and angular
window; 2) obtain the range profile, which is the inverse FT of the
range-corrected frequency response, at each aspect angle; and 3)
back-project the range profile of each aspect angle onto an image
plane to obtain the image. We will interpret and predict the
microwave image based on the above three steps.
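
Step 2 above can be illustrated with a minimal sketch, assuming an idealized object made of two point scatterers and a backscatter phase of exp(-j2kd) for a scatterer at differential range d. The function name `range_profile`, the scatterer positions, and the sign convention are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def range_profile(freqs_hz, response, ranges_m):
    """Inverse FT (with respect to 2k) of a range-corrected frequency response.

    The transform is evaluated as a direct sum at the requested differential
    ranges, so no FFT gridding or zero padding is needed.
    """
    k2 = 4.0 * np.pi * freqs_hz / C                # 2k for every frequency step
    kernel = np.exp(1j * np.outer(ranges_m, k2))   # exp(+j 2k d) for each range d
    return kernel @ response / len(freqs_hz)


# Hypothetical object: two ideal point scatterers at -10 cm and +15 cm.
freqs = np.linspace(6e9, 16.5e9, 101)              # stepped-frequency sweep
true_ranges = np.array([-0.10, 0.15])
H = sum(np.exp(-1j * 4.0 * np.pi * freqs / C * d) for d in true_ranges)

d_axis = np.linspace(-0.3, 0.3, 601)
profile = np.abs(range_profile(freqs, H, d_axis))  # peaks near the two ranges
```

The magnitude `profile` shows one narrow peak per scatterer; the peak width is set by the swept bandwidth.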

A different scattering mechanism might produce a different
appearance in its microwave image. In this paper we will only deal
with a special scattering mechanism, edge diffraction. For those
objects consisting of conducting plates, edge diffractions are
dominant contributors to the scattered field when the receiver is not in
the specular direction of any one of the visible plates comprising
the object. To a first-order approximation the field scattered from
the above type of objects can be considered as a summation of the
contributions from each "visible" plate, and the scattered field of
a plate can be considered as a summation of the diffracted field
from each "visible" edge. Therefore, diffraction from an edge is
the basic building block for the scattering problem of those objects
consisting of conducting plates.

In Section II the scattered field from an edge with finite length
will be reviewed, the physical properties of its range profile will
be explained, and the image formation for an edge with finite length
will be discussed. A trihedral reflector is an object consisting of
conducting plates. Experimental images of a trihedral reflector
reconstructed from data collected over different angular windows will
be demonstrated and interpreted in Section III.

II.  Scattered Field, Range Profile, and Image
Formation of a Finite Edge

Consider a conducting plate placed on a rotating pedestal as
illustrated in Fig. 1. Points P₁ and P₂ are two vertices of the plate,
and the line P₁P₂ forms an edge of the plate. In the laboratory
coordinate system, define the z-axis in the direction of the
rotational axis, and the x-axis in the direction of the line of sight. At
the starting angle the polar coordinates of the end points P₁ and P₂
are (r₁, θ₁, φ₁) and (r₂, θ₂, φ₂), respectively. As the plate is rotated
through an angle φ, the coordinates of P₁ and P₂ become (r₁, θ₁, φ₁ + φ)
and (r₂, θ₂, φ₂ + φ), respectively. The differential ranges of
these two end points at rotation angle φ are then r₁ sin θ₁ cos(φ₁ + φ)
and r₂ sin θ₂ cos(φ₂ + φ), respectively. It is noted that the
dependence of the differential ranges of the end points on the
rotation angle is sinusoidal.

Next we define an edge-fixed coordinate system for the plate. Let the
z′-axis be in the direction of the edge P₁P₂, and the x′-axis be
normal to the edge and lying on the plate surface. The corresponding
inclination angle of the transmitter/receiver to the edge-fixed
coordinate system is θ′(φ = 0°). As the plate is rotated through an
angle φ, the corresponding inclination angle for the edge-fixed
coordinate system becomes θ′(φ). It is noted that θ′ is not only a
function of φ but also a function of the orientation of the plate and
the edge.

The diffracted field of a wedge with finite length for arbitrary
incident and diffracted angles has been treated [6], where the
concept of equivalent electric current and equivalent magnetic
current has been applied. Denote the equivalent electric current and
equivalent magnetic current on the edge as I(z′) and M(z′). The
expressions of I and M for the backscattering case can be found in
[6]. They are functions of the inclination angle and the azimuth
angle, and are inversely proportional to the wavenumber k.

Fig. 1. Geometry and coordinates of an edge in the laboratory coordinate
system and edge-fixed coordinate system.

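The backscattered field of an edge with finite length L expressed
in the edge-fixed coordinate system can be written as [6]

$$E_{\theta} = \frac{j\omega\mu\, e^{-j2kr}}{4\pi r}\,\sin\theta'\, I(0)\, L\, \frac{\sin(kL\cos\theta')}{kL\cos\theta'} \qquad (1)$$

$$E_{\phi} = \frac{j\omega\epsilon\, e^{-j2kr}}{4\pi r}\,\eta\,\sin\theta'\, M(0)\, L\, \frac{\sin(kL\cos\theta')}{kL\cos\theta'} \qquad (2)$$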


where I(0) and M(0) are the equivalent electric current and
equivalent magnetic current at z′ = 0, and η is the characteristic
impedance of free space.

At a specific aspect the range profile is obtained by Fourier
transforming the range-corrected frequency response. After range correction
(i.e., the first two terms on the right of (1) and (2) being removed), the
range-corrected field can be further simplified to


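$$E_{\theta} = \sin\theta'\, I(0)\, L\, \frac{e^{jkL\cos\theta'} - e^{-jkL\cos\theta'}}{2jkL\cos\theta'} \qquad (3)$$

$$E_{\phi} = \eta\,\sin\theta'\, M(0)\, L\, \frac{e^{jkL\cos\theta'} - e^{-jkL\cos\theta'}}{2jkL\cos\theta'} \qquad (4)$$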


The FT of (3) and (4) with respect to 2k over a finite bandwidth
will give two peaks, located at range about ±(L/2) cos θ′, which
are at the differential ranges of the end points of the edge, with
amplitude proportional to I(0) or M(0), and 1/(L cos θ′) if θ′ ≠
90°. At the rotation angle φ such that θ′(φ) = 90°, the range
profile gives a single peak with strong magnitude because the two
end points of the edge have the same differential ranges and all the
points on the edge are equidistant from the observation point.
After realizing the aspect dependence of the range profile of the
edge, we can then form and predict the image of an edge by the
technique of back-projection [4], [5], [7]. After back-projection,
the contributions of a specific range profile to the reconstructed
image will be two parallel lines oriented in the direction of φ.
Because the trace of each end point versus the rotation angle is
sinusoidal, all back-projection lines for various rotation angles will
pass through the corresponding end point as in the case of
computer-aided tomography (CAT), intensifying the brightness of the
end points. However, when the aspect such that θ′(φ) = 90° is
within the angular window (i.e., the aspect at which the incident
wave is normal to the edge is contained within the angular
window), the back-projection due to this range profile will be a single
bright line.

Fig. 2. Image formation of an edge. (a) Sketch of the range profiles of an
edge at various aspect angles. (b) Implementation of the back-projection.
Sketches of the images after back-projection (c) including and (d)
without the aspect angle such that the edge is normal to the line of sight.
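
The two-peak behavior of the range profile can be checked numerically with a short sketch. It assumes that the frequency dependence of the range-corrected field is the sin(x)/x factor with x = kL cos θ′, with all slowly varying amplitude factors set to one. The function name `edge_range_profile`, the 30 cm edge length, and the frequency sweep are illustrative assumptions.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def edge_range_profile(theta_deg, length_m, freqs_hz, d_axis):
    """Magnitude of the FT (with respect to 2k) of sin(x)/x, x = k L cos(theta')."""
    k = 2.0 * np.pi * freqs_hz / C
    x = k * length_m * np.cos(np.radians(theta_deg))
    field = np.sinc(x / np.pi)                    # np.sinc(t) = sin(pi t)/(pi t)
    kernel = np.exp(1j * np.outer(d_axis, 2.0 * k))
    return np.abs(kernel @ field) / len(freqs_hz)


freqs = np.linspace(6e9, 16.5e9, 101)
d_axis = np.linspace(-0.2, 0.2, 801)

# Oblique aspect: two weak peaks expected near +/- (L/2) cos(theta') = +/- 7.5 cm.
oblique = edge_range_profile(60.0, 0.30, freqs, d_axis)
# Normal aspect (theta' = 90 deg): one strong peak expected at zero range.
normal = edge_range_profile(90.0, 0.30, freqs, d_axis)
```

Comparing `oblique` with `normal` reproduces the qualitative statement in the text: two end-point peaks away from normal incidence, and a single, much stronger peak when the edge is normal to the line of sight.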

The above explanation is illustrated in Fig. 2. At rotation angle
φ₀ the edge is normal to the line of sight (i.e., θ′(φ₀) = 90°), and
the range profile for that aspect has a single peak with large
amplitude. When the plate is rotated to another angle φ₁, the range
profile has two peaks located at d′ and d″ with smaller amplitudes
(see Fig. 2(a)). The implementation of back-projection is
illustrated in Fig. 2(b). The sketches of the image after back-projection
including and excluding the specular aspect are shown in Fig. 2(c)
and (d), respectively. The above discussions and illustrations
indicate that the microwave image of an edge will be two bright points
whose locations correspond to the end points of the edge if the
normal aspect angle is not included in the angular window; otherwise
a line joining the two points and representing the edge will
appear in the image.
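
The "two bright points" case can be sketched with a small back-projection simulation. It assumes the edge is replaced by two ideal point scatterers at its end points (a stand-in for the two-peak range profile, valid only when the normal aspect is outside the angular window) and a 2-D geometry in the plane normal to the rotation axis. The function names `simulate` and `back_project`, the end-point coordinates, and the grid are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def simulate(points, freqs, angles):
    """Range-corrected frequency response of ideal point scatterers."""
    k2 = 4.0 * np.pi * freqs / C
    data = np.zeros((len(angles), len(freqs)), dtype=complex)
    for x, y in points:
        d = x * np.cos(angles) - y * np.sin(angles)   # sinusoidal differential range
        data += np.exp(-1j * np.outer(d, k2))
    return data


def back_project(data, freqs, angles, xs, ys):
    """Sum, over all aspects, each range profile evaluated at every pixel's range."""
    k2 = 4.0 * np.pi * freqs / C
    X, Y = np.meshgrid(xs, ys)
    image = np.zeros(X.shape, dtype=complex)
    for row, phi in zip(data, angles):
        d_pix = X * np.cos(phi) - Y * np.sin(phi)
        image += np.exp(1j * d_pix[..., None] * k2) @ row
    return np.abs(image) / (len(angles) * len(freqs))


freqs = np.linspace(6e9, 16.5e9, 41)
angles = np.radians(np.linspace(0.0, 40.0, 41))       # normal aspect (90 deg) excluded
end_points = [(-0.10, 0.0), (0.10, 0.0)]              # edge lying along the x-axis
xs = ys = np.linspace(-0.2, 0.2, 41)

image = back_project(simulate(end_points, freqs, angles), freqs, angles, xs, ys)
```

In the resulting `image` the brightness concentrates at the two end points, because every back-projected line passes through them, while the interior of the edge stays dark.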

III.  Microwave Images of a Trihedral Reflector

To verify the new interpretation approach, the microwave
images of a trihedral reflector reconstructed from data collected over
various angular windows are demonstrated below.

Fig. 3. Geometry, imaging arrangement, fringe pattern, and sinogram of
a trihedral reflector. (a) Geometry and imaging arrangement of a
trihedral reflector. (b) Real part of the range-corrected frequency response
of the trihedral. (c) Sinogram of the trihedral reflector.

Fig. 4. Photo image and microwave images of a trihedral reflector. (a)
Microwave projective image reconstructed from data collected over
angular window from φ = 0° to 220°. (b) Projective photo image.
Microwave projective images reconstructed from data collected over (c)
angular window 1 (φ = 0° to 40°); (d) angular window 2 (φ = 40° to
80°); (e) angular window 3 (φ = 80° to 120°); (f) angular window 4
(φ = 120° to 160°); (g) angular window 5 (φ = 160° to 220°); and
(h) angular window 6 (φ = 220° to 360°).

The geometry of a trihedral reflector and the imaging arrangement
are shown in Fig. 3(a). The transmitting and receiving antennas
have opposite senses of circular polarization. A total of 101 equal
frequency steps covering the 6-16.5 GHz range were used to obtain
the frequency response of the trihedral reflector. The object is
rotated clockwise 360°. The real part of the range-corrected complex
frequency response of the trihedral reflector is shown as a central
slice of Fourier data in Fig. 3(b). The radial distance of a given
sample in this plot represents the frequency while the polar angle
represents the rotation angle φ. The brightness of each point is
proportional to the amplitude of the frequency response. The range
profiles for all aspect angles are represented as a sinogram. The
sinogram representation has been used in CAT [7] and is applied
here to represent the range profiles of various aspect angles. It is a
2-D intensity varied display with the abscissa of the differential
range, the ordinate of the aspect angle, and the intensity or
brightness proportional to the amplitude of the range profile. The
sinogram of the trihedral reflector is shown in Fig. 3(c). The bottom
line represents the range profile of the first aspect (φ = 0°) while
the top line represents the range profile of the last aspect angle.
The magnitude of the range profile is proportional to the brightness
of the display and the sinogram is displayed in linear scale. The
dynamic display range has been suitably chosen so that weak
signals will not be overridden. Examining the sinogram one can find
that bright points are present in certain aspect angles. Locations of
these bright points correspond to the differential ranges of the
"visible" edges that are normal to the line of sight.

The image reconstructed through an angular window covering
from φ = 0° to 220° is shown in Fig. 4(a). It is seen that the image
is a projective image projected onto the plane normal to the
rotational axis. The optical projective image is also shown in Fig. 4(b)
for comparison. To verify our new image interpretation approach
stated in the previous section, we divide the whole angular window
into six subwindows and reconstruct the image from each
subwindow. The resultant images are shown in Fig. 4(c)-(h). Examining
the resultant images, one can find that only those edges that are
normal to the line of sight within the specified angular window
appear in the image. The brightness of the end points of the edges
has been intensified. It is noted that no edges are normal to the line
of sight in angular window 3. Accordingly, no edges are present
in the image while the brightness of the vertices is intensified.
Furthermore, the effect of multiple reflections is pronounced for
some aspects in this window as can be seen from the fringe pattern
(Fig. 3(b)) and the sinogram (Fig. 3(c)). Multiple reflection usually
distorts the image because the range profile does not reflect the
range information of the object shape [5], [8]. In angular window
6 strong multiple reflections are dominant contributors to the
scattered field for most aspects. Although the edges can still be seen
in the image, artifacts due to multiple reflections distort the image.

IV.  Discussion  and  Conclusion 

In this paper we interpret the microwave image of a metallic
object from a new approach, based on the understanding of the
interconnection between the scattering mechanisms, the data
acquisition system, and the image reconstruction algorithm. From
this understanding we can interpret and predict the microwave
image reconstructed from data collected over specified spectral and
angular windows. Experimental results support this new approach
to image interpretation. Although the scattering mechanism treated
in this paper is confined to edge diffraction, the same approach
can also be applied to establish the connection between the other
scattering mechanisms and their reconstructed images [5].
Successful interpretation and prediction of the microwave image are
fundamental to research in several areas, including target
identification, classification, radar cross-section reduction, and image
distortion [8].
Abstract: The range profiles and images of a straight wire without and
with lumped impedance loading are discussed and demonstrated. The
scattering mechanisms of a straight wire are qualitatively analyzed. Plots
of range profiles at different aspect angles show that surface traveling
waves  are  important  scattering  mechanisms  of  a  straight  wire.  The 
presence  of  traveling  waves  makes  ringlike  artifacts  appear  in  the 
reconstructed  images.  It  is  found  that  lumped  impedance  loading  can 
effectively  distort  the  range  profiles  and  microwave  images  of  a  wire 
scatterer.  In  addition,  randomly  varied  reactive  loading  can  generate 
random  peaks  in  the  range  profiles. 

I.  Introduction

The properties of the field scattered from a scatterer
loaded with lumped impedance have been extensively
studied [1]-[3]. Several interesting phenomena can then be
deduced from the variation of the scattered fields. Fixed linear
impedance loading can change the natural frequencies of a
target [4]. Time-varying loading can make the receiver unable
to phase lock to the frequency of the incident wave and can
shift the apparent frequency of the scattered field to provide a
false Doppler shift [3]. It can also spread the spectrum of the
scattered field to decrease the energy within the bandwidth of a
receiver [5]. The sensitivity of these phenomena to impedance
loading has been discussed in [5].

In this paper, we will discuss two other properties, range
profiles and images of a straight wire without and with lumped
impedance loading. The range profile of a scatterer at an
aspect is defined as the Fourier transform (FT) of the
frequency response of the scattered field at that aspect. Range
profiles can give useful insight into the scattering mechanisms
of a scatterer. After the range profiles are obtained, an image
can then be formed by back projecting the complex range
profiles at various aspects into the imaging plane [6].
Microwave images of conducting objects have been
interpreted satisfactorily through the understanding of the
scattering mechanisms of the object and the procedures of the image
reconstruction algorithm [7]. Different scattering mechanisms
can produce different appearances of microwave images. It
was reported that a surface traveling wave is the dominant
scattering mechanism for the backscattered field of a long thin
wire with near end-on incidence [8]. Therefore, it will be of
interest to examine the appearance of the image for objects
with this special scattering mechanism.

In some applications it is desirable to distort the image of a
target so that it cannot be recognized by radar [9]. If the wire is
loaded with impedance, the impedance discontinuities on the
wire surface will cause extra radiation. The effect of
impedance loading on microwave images has not yet been reported.
Monochromatic imaging of a monopole antenna has been
studied holographically by Iizuka and Gregoris [10], who were
interested in visualizing resonance effects. However, what we
are interested in is a wire scatterer rather than an antenna and a
frequency diversity image instead of a monochromatic image.

In this paper we will use the moment method to numerically
calculate the field scattered from a loaded straight wire
scatterer and derive its range profile and microwave image
from the calculated scattered fields. A qualitative analysis of
the scattering mechanisms of a straight wire will be given first.
This analysis is then examined via plots of the numerical
range profiles, which are then compared with the
experimental results. The effect of impedance on the range
profiles and the reconstructed images will be presented and
discussed.

II.  Scattering  Mechanisms  of  a  Straight  Wire 

Radiation can originate from several places on an arbitrarily
shaped wire object. These include the excitation region, an
impedance load, a change in radius, a sharp bend, a smooth
curve, and an open end [3]. Consider a straight wire
illuminated by an impulsive plane wave with angle of
incidence θ as shown in Fig. 1. In this scattering arrangement
the only places which cause radiation are the end points of the
wire. The pulse traveling in free space impinges on the upper
end point first. Part of the incident energy is then reradiated,
and the remaining energy continues to travel along the wire.
This traveling pulse will be partly reradiated when it reaches
the lower end and partly reflected upward along the wire. This
process of radiation and reflection continues until the pulse
dies out. The original pulse propagating in free space hits the
lower end point some time after it impinges on the upper end
point. The process of radiation, reflection, and guided
propagation along the wire will then occur just as in the case of
the upper end point. Different wave motions resulting from
multiple interactions between the ends of the wire are
indicated in Fig. 1. The differential path length l_i, which is the
ith wave motion path relative to the path length when the
impulsive illumination hits the center point of the wire, is as
follows:

$$
\begin{aligned}
l_1  &= -h\cos\theta - h\cos\theta = -2h\cos\theta \\
l_1' &= h\cos\theta + h\cos\theta = 2h\cos\theta \\
l_2  &= -h\cos\theta + 2h + h\cos\theta = 2h \\
l_2' &= h\cos\theta + 2h - h\cos\theta = 2h \\
l_3  &= -h\cos\theta + 4h - h\cos\theta = 4h - 2h\cos\theta \\
l_3' &= h\cos\theta + 4h + h\cos\theta = 4h + 2h\cos\theta \\
l_4  &= -h\cos\theta + 6h + h\cos\theta = 6h \\
l_4' &= h\cos\theta + 6h - h\cos\theta = 6h.
\end{aligned}
$$

Fig. 2. Numerical range profiles of straight wire for angles of incidence
equal to (a) 30°. (b) 45°. (c) 60°. (d) 75°.

0018-926X/89/0100-0094$01.00 © 1989 IEEE
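
The differential path lengths of the eight wave motions can be tabulated with a few lines of code. This is a minimal sketch that assumes h is the half-length of the wire (inferred from the 2h added for each traverse along the wire) and that θ is given in degrees. The function name `differential_path_lengths` and the dictionary keys are hypothetical.

```python
import math


def differential_path_lengths(h, theta_deg):
    """Return l_1..l_4 and l_1'..l_4' for half-length h and incidence angle theta."""
    c = h * math.cos(math.radians(theta_deg))   # projection of h on the line of sight
    return {
        "1": -2.0 * c,            # direct return from the upper end
        "1'": 2.0 * c,            # direct return from the lower end
        "2": 2.0 * h,             # one traverse, radiated from the opposite end
        "2'": 2.0 * h,
        "3": 4.0 * h - 2.0 * c,   # two traverses, radiated from the upper end
        "3'": 4.0 * h + 2.0 * c,  # two traverses, radiated from the lower end
        "4": 6.0 * h,             # three traverses
        "4'": 6.0 * h,
    }


# Example: a 30 cm wire (h = 15 cm) at 60 degrees incidence, lengths in cm.
paths = differential_path_lengths(15.0, 60.0)
```

Only paths 1, 1′, 3, and 3′ depend on θ, which is why the corresponding peaks move in the range profile as the aspect changes while peaks 2, 2′, 4, and 4′ stay fixed.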

III.  Range Profiles and Reconstructed Images

To examine the previous analysis, we use the moment
method to theoretically calculate the field scattered from a
straight wire scatterer, from which we derive the range profiles
and reconstruct the image. The piecewise sinusoidal Galerkin
method is used [11]. Let the ratio of the length to the radius be
100, the length of wire be 30 cm, and the frequency coverage
be from 6 to 16 GHz. In other words, the length in terms of
wavelength ranges from 6 to 16. The polarization of the
incident field is assumed to be θ-polarized. Shown in Fig. 2 are
the magnitudes of the range profiles for angles of incidence
equal to 30°, 45°, 60°, and 75°. The location of the peak
marked with i corresponds to the differential path length of the
ith wave motion shown in Fig. 1. If we carefully examine the
range profiles shown in Fig. 2, we can find that the peaks
marked with 1 and 1′ depart more from the center as the angle
θ decreases, while the peaks marked with 2 and 2′ and 4 and
4′ remain at the same position and thus are independent of the
angle of incidence. Those peaks marked with 3 move toward
2, while those marked with 3′ move toward 4′ as θ decreases,
in agreement with the expressions for l_3 and l_3′ above.
These observations verify the analysis stated in Section II.
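
The statement that a 30 cm wire spans 6 to 16 wavelengths over the 6 to 16 GHz sweep can be checked directly. The helper name `electrical_length` is hypothetical, and the small departure from whole numbers comes from using the exact speed of light rather than 3 × 10^8 m/s.

```python
C = 299_792_458.0  # speed of light in m/s


def electrical_length(length_m, freq_hz):
    """Physical length expressed in free-space wavelengths at the given frequency."""
    return length_m * freq_hz / C


low = electrical_length(0.30, 6e9)     # about 6 wavelengths at 6 GHz
high = electrical_length(0.30, 16e9)   # about 16 wavelengths at 16 GHz
```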

A real thin rod with length 12" and diameter 1/8" is used as
a test object to experimentally verify the numerical results.
The measurement arrangement is shown in Fig. 3. The wire
scatterer is mounted on a rotating pedestal controlled by a
microcomputer. A set of stepped frequencies is transmitted, and
the scattered field is received. After that, the object is rotated
and the measurement is repeated. The frequency coverage is
from 6 to 16.5 GHz. The transmitting and receiving antennas
are right-hand circularly polarized and left-hand circularly
polarized, respectively. A bistatic angle of
16° exists between these two antennas. The range profiles of
this thin rod at several aspects are shown in Fig. 4. In this
bistatic case the differential path lengths of path 2 and path 2′
are not equal. This fact accounts for the discrepancy between
the experimental and numerical range profiles.

From the previous analysis and the range profiles shown in
Figs. 2 and 4, one can see that the phenomenon of surface
traveling waves is quite evident in the straight wire case. If the
rotation center is chosen at the center point, the effect of
constant ranges (2h, 6h, etc.) on the reconstructed image will
be rings with constant radius [7]. Shown in Figs. 5(a) and 5(b)
are the numerical and experimental images reconstructed from
the data collected over an angular window in θ from 20° to
80°, respectively. It is seen that ringlike artifacts appear in the
images and the end points are intensified. This example shows
that the presence of traveling waves usually degrades the
image.

Fig. 4. Experimental range profiles of thin rod for angles of incidence
equal to (a) 30°. (b) 45°. (c) 60°. (d) 75°.

IV.  Effect  of  Impedance  Loading 

The geometry of a loaded straight wire is shown in Fig. 6. If
three lumped resistors, each with resistance 50 Ω, are added at
C₁ = 0.5h, C₂ = 0, C₃ = -0.5h, these loading points will
cause extra reflections. Both the incident wave impinging on
the loading points and the waves traveling along the wire
arriving at the loading point will cause additional reflections.
This fact results in additional peaks in the range profiles.
Shown in Fig. 7 are the range profiles of the three-loading-
point wire at several angles of incidence. Examining these
plots, one finds that more lobes appear and the lobe produced
by the loading point is not as narrow as those produced by the
end points. Furthermore, the number of lobes between 1 and
1′ is not necessarily equal to the number of loading points (for
example, see Fig. 7(b) with θ = 45°). These additional lobes
are due to reflections of the traveling waves.

Fig. 5. (a) Numerical image reconstructed from data collected over angular
window from θ = 20° to 80°. (b) Experimental image reconstructed from
the data collected over an angular window from θ = 20° to 80°.

Fig. 6. Geometry of loaded straight wire.

Fig. 7. Numerical range profiles of straight wire with three loading points
at angles of incidence equal to (a) 30°. (b) 45°. (c) 60°. Horizontal axis:
distance in cm.

To examine the loading effect experimentally, we divide the
thin rod into three sections with a 1-mm gap between them.
These gaps are expected to produce a loading effect. However,
it is difficult to assign a loading value to each gap. In addition,
the equivalent loading impedance is also a function of
frequency because the gap distance in terms of wavelength
changes with frequency. The experimental range profiles for
several angles of incidence are shown in Fig. 8. Extra peaks
appear in the range profiles due to the discontinuities of the
gaps. However, the magnitudes of these peaks differ from their
counterparts in Fig. 7.

The numerical and experimental images reconstructed from
the data collected over an angular window from θ = 20° to
80° are shown in Figs. 9(a) and 9(b), respectively. It is seen
that the loading impedance and the surface traveling waves
have distorted the images. By comparing the images of Figs. 9
and 5, one can conclude that the images have been successfully
distorted by impedance loading. However, the price paid is an
increase in the radar cross section (RCS) [5].

Finally, we examine the effect of time-varying loading on the range profiles. Range profiles usually give the maximum and minimum range information of an object along the direction of propagation, which in turn provides information about the target dimension. In some applications, it is desired to distort the range profile so that the radar cannot deduce the object dimension from the range profile. One may use an active broad-band slave jammer to distort the range information, but this is not what we wish to discuss. We try to use a passive impedance load to achieve this goal. Impedance loading can change the magnitude and phase of the scattered fields. If the loaded values are randomly varied for each time instant, the randomness might cause random peaks in the range profile.

Fig. 8. Experimental range profiles of three-segment thin rod at angles of incidence equal to (a) 30°. (b) 45°. (c) 60°. (d) 75°.

Fig. 9. (a) Numerical image of straight wire with three loading points reconstructed from data collected over angular window from θ = 20° to 80°. (b) Experimental images of three-segment thin rod reconstructed from data collected over angular window from θ = 20° to 80°.

It  has  been  concluded  that  reactive  loading  can  make  a  more 
drastic  change  in  the  scattered  fields  (either  in  phase  or 
amplitude) than resistive loading can, and increasing the
number  of  loading  points  can  produce  greater  field  variation 
[5].  It  is  also  known  that  the  reflection  coefficient  at  a  given 
point  is  a  function  of  the  characteristic  impedance  and  the 
loading  impedance  at  that  point.  If  the  loading  impedance  at  a 
point  is  randomly  switched  between  capacitive  loading  and 
inductive  loading,  the  phase  of  the  reflection  coefficient  at  that 
point  will  be  changed  at  each  time  instant.  Consequently,  the 
peak  of  the  range  profile  at  that  point  may  be  reduced,  and  the 
phase  variation  of  the  backscattered  field  between  two  adjacent 
frequencies  may  be  more  abrupt.  This  will  increase  the 
effectiveness  of  the  random  loading  in  distorting  the  range 
profile. 
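
As a rough illustration of why switching between capacitive and inductive loads changes the phase so strongly, the following sketch evaluates the reflection coefficient of a simple terminated-line model, Γ = (Z_L − Z_0)/(Z_L + Z_0). The model itself, the function name `reflection_coefficient`, and the 50 Ω characteristic impedance are assumptions made for illustration only; the text states merely that the coefficient depends on the characteristic and loading impedances.

```python
import numpy as np

def reflection_coefficient(z_load, z0=50.0):
    """Reflection coefficient of a terminated line (assumed model, not taken from the paper)."""
    z_load = np.asarray(z_load, dtype=complex)
    return (z_load - z0) / (z_load + z0)

rng = np.random.default_rng(0)

# Randomly varied resistive loads, 0 to 100 ohm: the coefficient stays real,
# so only its magnitude and sign change from one time instant to the next.
resistive = reflection_coefficient(rng.uniform(0.0, 100.0, 8))

# Randomly varied reactive loads, -j50 to +j50 ohm: the magnitude stays at 1
# while the phase moves over a 180-degree span.
reactive = reflection_coefficient(1j * rng.uniform(-50.0, 50.0, 8))

print("resistive values:", np.round(resistive.real, 3))
print("reactive phases (deg):", np.round(np.degrees(np.angle(reactive)), 1))
```

Under this assumed model a purely reactive load reflects all of the incident wave and only its phase is randomized, which is in line with the qualitative argument above.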

In the following we compare the effect of a fixed loading, a randomly varied resistive loading, and a randomly varied reactive loading on the range profiles of a loaded straight wire. The parameters used are number of loading points n equal to 5 and θ = 45°. Each loading impedance is either fixed at 50 Ω, or randomly resistively varied from 0 to 100 Ω, or randomly reactively varied from −j50 to +j50 Ω. The range profiles of the three loading cases are shown in Figs. 10(a), 10(b), and 10(c), respectively. From the figure one can find that the difference between the range profile with fixed resistive loading and that with randomly varied resistive loading is small, and randomly varied resistive loading does not create random peaks.

V.  Conclusion 

The  scattering  mechanism  of  a  straight  wire  has  been 
qualitatively  analyzed.  Plots  of  range  profiles  at  different 
aspect  angles  show  that  surface  traveling  waves  are  important 
scattering  mechanisms  of  a  straight  wire.  The  presence  of 
traveling  waves  makes  ringlike  artifacts  appear  in  the  image  of 
a  straight  wire.  It  is  also  found  that  lumped  impedance  loading 
can  effectively  distort  the  range  profiles  and  the  reconstructed 
images. Furthermore, randomly varied reactive loading can

Abstract: A new iterative method for extrapolation of incomplete segmented data available in multiple separated bands is proposed and tested. The method uses the Burg algorithm to find the linear prediction parameters and an iterative procedure to improve the estimation of the linear prediction parameters and the extrapolation of the data. This method is especially effective when the spectra (Fourier transform of the observed data) are in discrete forms. In the context of radar imaging represented here, this means the objects consist of distinctly spaced scattering centers. The advantages of this algorithm are demonstrated using both numerically generated and realistic experimental data pertaining to high resolution radar imaging.

I.  Introduction 

It is well known that the resolution of microwave diversity imaging systems [1] depends on the spectral and angular (aspect related) windows. To obtain the range information of the target, one can use a pulsed signal analyzed in the time domain and map the range profile of the target as a function of aspect angle, or use a broad-band continuous wave (CW) signal analyzed in the frequency domain to yield its frequency response. The range resolution is inversely proportional to the bandwidth coverage of the measurement system. In practical situations, however, due to limitations of the measurement system or restrictions on bandwidth allocation, the observed data can lie in multiple restricted spectral regions, which we call passbands. Several methods of extrapolating the measured data beyond the observed regions have been proposed and tested [2]-[4] in an attempt to achieve the full resolution of the unrestricted spectral range, when a priori knowledge of the maximum dimension of the object exists and an iterative procedure is applied. The use of linear prediction for the interpolation and extrapolation of missing data and data gaps has also been reported [5].
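
The inverse relation between range resolution and bandwidth can be made concrete with the usual rule of thumb for a monostatic measurement. The formula is stated here as an assumption, since the text gives only the proportionality, and the 10 GHz bandwidth is an illustrative value:

```latex
\delta r \approx \frac{c}{2B},
\qquad
B = 10\ \text{GHz}
\;\Rightarrow\;
\delta r \approx \frac{3 \times 10^{8}\ \text{m/s}}{2 \times 10^{10}\ \text{Hz}} = 1.5\ \text{cm}.
```

Under the same rule, restricting the data to a passband one quarter as wide would coarsen the resolution to about 6 cm, which is why extrapolation into the vacant bands is attractive.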

To increase the resolution obtained from spectral data of such limited extent, techniques of nonlinear power spectrum estimation have been used with notable success [6]. These include autoregression (AR), linear prediction (LP), the maximum entropy method (MEM), and the multiple signal classification (MUSIC) algorithm [7]. For a stationary Gaussian process, the first three methods above can be shown to be equivalent [6].

Most  of  the  nonlinear  spectrum  estimation  techniques  are 
developed  to  process  the  data  in  the  time  and  frequency 
domains.  However,  there  is  an  analogy  between  the  time- 
frequency  domain  and  space-spatial-frequency  domain.  In 
microwave  diversity  imaging,  for  a  given  aspect  angle,  the 
frequency  response  of  the  scattered  fields  corresponds  to  a  set 
of  time-series  data,  while  the  square  of  the  absolute  value  of 
the  range  profile,  which  is  defined  as  the  Fourier  transform  of 
the  frequency  response,  corresponds  to  the  power  spectrum. 

It  is  known  that  the  linear  prediction  method  is  especially 
suited  for  those  cases  when  the  spectra  are  in  discrete  form. 
Under  high-frequency  conditions,  the  scattered  field  from  a 
complex  shaped  target  can  be  attributed  to  a  few  discrete 
scattering  centers  that  include  edges.  It  will  be  shown  that 
under  the  high  frequency  approximation  the  locations  of  the 
scattering centers and their scattering strengths are independent of the operating frequency for a given transmitter/receiver pair. This is equivalent to saying that the spectra (or range profiles) of the scatterer are also of discrete form. These phenomena provide the motivation to apply the linear prediction method to microwave diversity imaging.

Although  the  spectra  estimated  by  MEM  or  AR  can  be  very 
sharp  and  well  resolved,  this  may  not  be  an  advantage  in  a 
microwave  imaging  system.  If  the  data  are  not  sampled 
densely  enough  in  the  spectral  domain,  the  sharp,  well- 
resolved  components  may  be  missed,  and  the  results  may  not 
faithfully  reflect  the  actual  spectral  amplitudes.  Besides,  image 
reconstruction  from  microwave  diversity  imaging  systems 
involves  coherent  superposition  of  the  data  in  the  spectra,  or 
range  profiles,  of  the  scatterer  (obtained  at  different  aspect 
angles), where these are estimated from partial data available in segmented bands [1]. If the estimated amplitudes of the range profiles obtained by MEM or AR depart from the desired values because of undersampling, image degradation will result. Furthermore, we are interested not only in the
magnitude  of  the  range  profile,  but  also  in  the  phase  of  the 
range  profiles  as  required  for  the  coherent  superposition. 
Therefore,  to  overcome  the  dense  sampling  requirement  and 
retain  the  phase  information  of  the  range  profiles,  it  may  be 
preferable  to  extrapolate  the  data  available  in  the  various 
passbands  into  the  vacant  bands  before  the  spectra  or  range 
profile are formed, and image reconstruction is then undertaken.

To extrapolate the data beyond the observed region, an intuitive way is to predict the exterior data by using the same parameters obtained by the linear prediction model. One of the most popular approaches to linear prediction parameter estimation with N data samples is the Burg algorithm [6], [8]. For a given number of data samples in a given observation interval, in order to separate the discrete spectra (in this paper, spectrum is defined as the Fourier transform of the observed data), the required model order in the linear prediction method increases as the separation of spectra decreases; i.e., it is easier to model the data sequence for spectra with larger separation, which translates into well-separated scattering centers, than for those with closer separation. In addition, for a given model order and given number of sampling points, it is easier to distinguish two close spectral components (scattering centers) with a data set spanning a longer observation interval than with one spanning a shorter observation interval. It was also suggested that the model order should not exceed half of the number of data points for short data segments because otherwise the linear prediction spectral estimate will exhibit spurious peaks [6]. From the above observations, one can conclude that it would be more difficult to resolve two closer point targets (Fourier transform of the observed data in frequency domain) with a short data band. If all the observed data within multiple restricted regions can be fully utilized, better resolution can be expected.

In this paper, a new iterative method which uses the Burg algorithm to find the linear prediction parameters and an iterative procedure to modify the prediction parameters is proposed and tested with both simulated and realistic measured data generated in our anechoic chamber experimental microwave imaging and measurement facilities. With this algorithm, one can obtain acceptable extrapolation beyond the observed region if the spectra are in discrete forms and the separation of the spectra is not too small. Both simulations and experimental results are presented to demonstrate, as an example, the effectiveness of the method in microwave diversity radar imaging.

II.  The  New  Iterative  Algorithm 

An approach to linear prediction parameter estimation with N data samples $\{x_0, \ldots, x_{N-1}\}$ was introduced by Burg [8]. The linear prediction parameters are obtained by minimizing the sum of the forward and backward prediction error energies $\varepsilon_p$,

$$\varepsilon_p = \sum_{n=p}^{N-1} |e_{pn}|^2 + \sum_{n=p}^{N-1} |b_{pn}|^2, \qquad (1)$$

subject to the constraint that the prediction parameters satisfy a recursion relationship [5]. Here $e_{pn}$ is the forward prediction error with model order p and is given by

$$e_{pn} = \sum_{k=0}^{p} a_{pk}\, x_{n-k}, \qquad (2)$$

and $b_{pn}$ is the backward prediction error with model order p and is given by

$$b_{pn} = \sum_{k=0}^{p} a_{pk}^{*}\, x_{n-p+k}. \qquad (3)$$

Fig. 1. Available data in multiple regions. Passband (shaded region) surrounded by vacant bands.

The $a_{pk}$ are called linear prediction parameters, and the asterisk denotes the complex conjugate operation.
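
A minimal sketch of a Burg-type estimator is given below to make the minimization concrete. It follows the standard lattice recursion (reflection coefficient, coefficient update, error update) and is not taken from the paper; the function name `burg`, the zero-based indexing, and the test signal are assumptions made for illustration.

```python
import numpy as np

def burg(x, order):
    """Burg estimate of the prediction-error filter a = [1, a_1, ..., a_p].

    Convention: forward error e(n) = sum_k a_k x(n-k),
    backward error b(n) = sum_k conj(a_k) x(n-p+k).
    """
    x = np.asarray(x, dtype=complex)
    f = x[1:].copy()     # forward errors of the order-0 model
    b = x[:-1].copy()    # backward errors of the order-0 model, delayed by one sample
    a = np.array([1.0 + 0.0j])
    for _ in range(order):
        # Reflection coefficient minimizing the summed forward + backward error energy.
        k = -2.0 * np.vdot(b, f) / (np.vdot(f, f).real + np.vdot(b, b).real)
        # Levinson-type update of the filter coefficients.
        a = np.append(a, 0.0)
        a = a + k * np.conj(a[::-1])
        # Lattice update of both error sequences; each loses one sample per stage.
        f, b = (f + k * b)[1:], (b + np.conj(k) * f)[:-1]
    return a

# Usage: a single complex exponential is modeled exactly at order 1.
n = np.arange(32)
x = np.exp(1j * 0.7 * n)
a = burg(x, 1)
print(a)  # the second coefficient is close to -exp(0.7j)
```

For a noise-free single exponential the errors vanish after the first stage, so higher orders should only be requested when the data contain more components or noise.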

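If one is going to extrapolate from the available data beyond the observed region, a straightforward way is to use the estimated prediction parameters $a_{pk}$ and the measured data in the following equations:

$$\hat{x}_{N+j} = -\sum_{k=1}^{p} a_{pk}\, x_{N+j-k}, \qquad j \ge 0, \qquad (4)$$

$$\hat{x}_{-j} = -\sum_{k=1}^{p} a_{pk}^{*}\, x_{-j+k}, \qquad j > 0, \qquad (5)$$

where the caret denotes the estimated value. Beyond the first few extrapolated samples, previously estimated values take the place of measured samples on the right-hand side.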
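
The two recursions above can be sketched as follows. The function names `extrapolate_forward` and `extrapolate_backward` are hypothetical, and the coefficient vector is assumed to be stored as `a = [1, a_1, ..., a_p]`; the known one-pole model in the usage lines is chosen only because its extrapolation can be checked by hand.

```python
import numpy as np

def extrapolate_forward(x, a, count):
    """Predict `count` samples past the end: x_hat(n) = -sum_{k=1..p} a_k x(n-k)."""
    p = len(a) - 1
    buf = list(np.asarray(x, dtype=complex))
    for _ in range(count):
        past = np.array(buf[-1:-p - 1:-1])        # x(n-1), ..., x(n-p)
        buf.append(-np.dot(a[1:], past))
    return np.array(buf[len(x):])

def extrapolate_backward(x, a, count):
    """Predict `count` samples before the start: x_hat(n) = -sum_{k=1..p} conj(a_k) x(n+k)."""
    p = len(a) - 1
    buf = list(np.asarray(x, dtype=complex))
    for _ in range(count):
        future = np.array(buf[:p])                # x(n+1), ..., x(n+p)
        buf.insert(0, -np.dot(np.conj(a[1:]), future))
    return np.array(buf[:count])

# Usage with a known one-pole model: x(n) = exp(j*w*n) has a = [1, -exp(j*w)].
w = 0.7
a = np.array([1.0, -np.exp(1j * w)])
x = np.exp(1j * w * np.arange(10, 20))            # samples n = 10..19
ahead = extrapolate_forward(x, a, 5)              # estimates for n = 20..24
behind = extrapolate_backward(x, a, 5)            # estimates for n = 5..9
```

Each new estimate is appended to the working buffer, so later estimates reuse earlier ones; this is also how extrapolation errors can accumulate when the model is inaccurate.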

If  the  data  available  are  confined  to  multiple  separate 
spectral  regions  or  passbands  of  equal  width  as  illustrated  in 
Fig. 1, and one tries to extrapolate from the observed data to
the  vacant  bands,  an  intuitive  method  is  to  divide  the  inner 
vacant  band  into  two  parts  of  equal  width  and  to  extrapolate 
into  the  left  part  by  using  the  prediction  parameters  obtained 
from  the  data  set  of  region  I  and  extrapolate  into  the  right  part 
by  using  the  model  parameters  obtained  from  the  data  set  of 
region  II. 

If  the  data  sequence  can  be  correctly  expressed  by  the 
prediction  parameters,  then  the  extrapolation  error,  which  is 
defined  as  the  absolute  value  of  the  complex  difference 
between  the  actual  values  (either  computer  generated  or 
measured  values)  and  extrapolated  values,  would  be  very 
small.  However,  if  the  prediction  parameters  cannot  model  the 
sequence  correctly,  the  error  of  extrapolation  may  accumulate. 
We  have  found  that  the  linear  prediction  model  which 
characterizes  the  data  sequence  is  more  accurate  for  longer 
data  strings  and  larger  model  orders,  especially  in  the  presence 
of noise. However, the model order should not exceed half the number of samples because otherwise the estimated spectrum will exhibit spurious peaks [6].

In  order  to  utilize  the  information  available  in  different 
regions,  a  new  iterative  algorithm  using  the  Burg  algorithm  to 
estimate  the  prediction  parameters  and  an  iterative  procedure 
is  proposed.  The  procedure  illustrated  in  Fig.  2  is  as  follows. 

1)  Divide  the  inner  vacant  band  into  two  parts  of  equal 
width.  Extrapolate  into  the  left  part  by  using  the 
prediction  parameters  obtained  from  the  data  set  of 
region  I  and  extrapolate  into  the  right  part  by  using  the 
prediction  parameters  obtained  from  the  data  set  of 
region  II.  If  the  bands  are  not  equal  in  width,  unequal 
division of the vacant intervening bands may be appropriate.

ITERATION ALGORITHM

1. Available data in regions I & II are used to extrapolate into regions III & IV.

2. Use data in regions I + II + III + IV to estimate the parameters $\{a_j\}_i$, where i represents the iteration number.

3. Use data in I and $\{a_j\}_i$ to extrapolate into region III. Use data in II and $\{a_j\}_i$ to extrapolate into region IV.

Convergence Test

a. Use data in III and $\{a_j\}_i$ to estimate new data values in region I. Use IV and $\{a_j\}_i$ to estimate new data values in region II.

b. Calculate error $E_i$.

c. For the resultant data in step 3: if $E_i < E_{i-1}$, set i = i + 1 and go from step 3 to 2; otherwise the iteration is stopped.

Fig. 2. Schematic diagram of the proposed new iterative extrapolation method.



2)  With  the  vacant  bands’  data  together  with  the  observed 
data,  use  the  Burg  algorithm  to  find  a  new  set  of 
prediction  parameters. 

3)  Using  this  set  of  prediction  parameters  and  the  data  of 
region  I,  extrapolate  into  the  left  part  of  vacant  bands, 
and  using  the  same  set  of  prediction  parameters  and  the 
data  in  region  II,  extrapolate  into  the  right  part  of  the 
vacant  bands. 

4) Using this set of parameters together with the extrapolated data, estimate the data in the observation regions I and II. Calculate the error energy between the measured data and the estimated data in the observation regions. The error energy is denoted by $E_1$ and is given by

$$E_1 = \sum_i \left( |x_i - \hat{x}_i^{f}|^2 + |x_i - \hat{x}_i^{b}|^2 \right) = \sum_i \left( |e_i|^2 + |b_i|^2 \right), \qquad (6)$$

where $x_i$ are the measured data, $\hat{x}_i^{f}$ are the forward estimates of $x_i$, $\hat{x}_i^{b}$ are the backward estimates of $x_i$, $e_i$ is the forward prediction error, and $b_i$ is the backward prediction error.

5) With the measured data together with the estimated vacant bands’ data, use the Burg algorithm to find a new set of prediction parameters. From the measured data and this new set of prediction parameters, extrapolate the vacant bands’ data as described in step 3.

6) Use the same procedure of step 4 to calculate the new error energy of the passbands, call it $E_2$.

7) Compare $E_2$ with $E_1$. If $E_2$ is smaller than $E_1$, replace the error energy $E_1$ by $E_2$ and repeat step 5.

8) If $E_2$ is greater than $E_1$, stop the iteration, and take the extrapolated data of the previous loop as the final result.

In step 1, if the width of a single band (band I and/or band II) is not large enough, the extrapolation errors produced by the prediction parameters obtained from single passband data may be very large; in that case, we can set the data in the vacant bands to zero.

The  above  iterative  method  can  be  easily  applied  to  the  case 
where  only  one  single  data  band  is  available.  The  procedures 
are  almost  the  same  except  that  only  one  data  sequence  is  used 
to  extrapolate  to  the  exterior  bands  and  to  calculate  the 
extrapolation  errors. 
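
The eight steps above can be sketched end to end as follows. This is an illustrative reading of the procedure rather than the authors' implementation. The assumptions are: all function names are hypothetical; bands are given as zero-based inclusive index pairs; the model order does not exceed the passband length; the error energy of step 4 is taken as the one-step forward plus backward prediction error over the passbands; and `max_iter` and the small constant guarding the division are added for robustness.

```python
import numpy as np

def burg(x, p):
    """Burg estimate of a = [1, a_1, ..., a_p] with e(n) = sum_k a_k x(n-k)."""
    f, b = x[1:].astype(complex), x[:-1].astype(complex)
    a = np.array([1.0 + 0.0j])
    for _ in range(p):
        den = np.vdot(f, f).real + np.vdot(b, b).real + 1e-300   # guard against 0/0
        k = -2.0 * np.vdot(b, f) / den
        a = np.append(a, 0.0)
        a = a + k * np.conj(a[::-1])
        f, b = (f + k * b)[1:], (b + np.conj(k) * f)[:-1]
    return a

def forward(y, a, start, stop):
    """Fill y[start:stop] in place by forward prediction."""
    p = len(a) - 1
    for n in range(start, stop):
        y[n] = -np.dot(a[1:], y[n - p:n][::-1])

def backward(y, a, start, stop):
    """Fill y[start], y[start-1], ..., y[stop+1] in place by backward prediction."""
    p = len(a) - 1
    for n in range(start, stop, -1):
        y[n] = -np.dot(np.conj(a[1:]), y[n + 1:n + p + 1])

def fill_vacant(y, a1, a2, band1, band2):
    """Extrapolate from both passbands into every vacant sample (passbands untouched)."""
    (i1, i2), (i3, i4) = band1, band2
    mid = (i2 + i3 + 1) // 2               # inner vacant band split into two halves
    backward(y, a1, i1 - 1, -1)            # outer left, from band I
    forward(y, a1, i2 + 1, mid)            # inner left half, from band I
    backward(y, a2, i3 - 1, mid - 1)       # inner right half, from band II
    forward(y, a2, i4 + 1, len(y))         # outer right, from band II

def passband_error(y, a, bands):
    """Forward plus backward one-step prediction error energy over the passbands."""
    p, energy = len(a) - 1, 0.0
    for lo, hi in bands:
        for n in range(lo, hi + 1):
            if n - p >= 0:
                energy += abs(np.dot(a, y[n - p:n + 1][::-1])) ** 2
            if n + p < len(y):
                energy += abs(np.dot(np.conj(a), y[n:n + p + 1])) ** 2
    return energy

def iterative_extrapolation(x, band1, band2, p, max_iter=10):
    y = np.array(x, dtype=complex)         # values outside the passbands are overwritten
    (i1, i2), (i3, i4) = band1, band2
    # Step 1: one model per passband.
    fill_vacant(y, burg(y[i1:i2 + 1], p), burg(y[i3:i4 + 1], p), band1, band2)
    best, e_prev = y.copy(), np.inf
    for _ in range(max_iter):
        a = burg(y, p)                                  # steps 2 and 5
        fill_vacant(y, a, a, band1, band2)              # step 3
        e_new = passband_error(y, a, (band1, band2))    # steps 4 and 6
        if not e_new < e_prev:                          # steps 7 and 8
            break
        best, e_prev = y.copy(), e_new
    return best

# Usage on a synthetic two-component record with light noise (illustrative values).
rng = np.random.default_rng(0)
n = np.arange(200)
truth = np.exp(1j * 0.3 * n) + 0.5 * np.exp(-1j * 0.9 * n)
noisy = truth + 0.01 * (rng.standard_normal(200) + 1j * rng.standard_normal(200))
band1, band2 = (29, 79), (119, 169)        # two 51-sample passbands
estimate = iterative_extrapolation(noisy, band1, band2, p=10)
```

The first pass through the loop is always accepted, mirroring the step sequence in which the first error energy has nothing to be compared against.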

III.  Scattering  Properties  of  a  Metallic  Object 

In  this  section  we  shall  show  that  under  the  high  frequency 
approximation  the  scattered  fields  of  a  metallic  object  can  be 
expressed  as  superposition  of  scattered  fields  of  discrete 
scattering  centers.  These  phenomena  allow  us  to  apply  the 
proposed  extrapolation  algorithm  to  radar  imaging. 

For  a  metallic  object  large  compared  with  wavelength,  the 
scattering  mechanism  can  be  divided  into  the  following 
components  [9]. 

1)  specular  scattering  points; 

2) scattering from surface discontinuity: edges, corners, tips, etc.;

3)  scattering  from  surface  derivative  discontinuities; 

4)  creeping  waves; 

5)  traveling  waves; 

6)  scattering  from  concave  regions; 

7)  multiple  scattering  points. 

For  most  situations,  the  major  contributions  to  the  scattered 
waves  are  ascribed  to  the  specular  scattering  points  and  edge 
diffractions. 

Consider a metallic object seated on a rotated pedestal and illuminated by a plane wave as shown in Fig. 3. The distances between the rotation center O and the transmitter and receiver are $R_t$ and $R_r$, respectively, and the unit vectors in the directions of transmitter and receiver are $\hat{i}_t$ and $\hat{i}_r$, respectively. Under the physical optics and Born approximations, the vector potential at the receiver under the far-field condition can be expressed as [1]

$$\bar{A}(k, \hat{i}_t, \hat{i}_r) = \frac{\mu_0}{4\pi R_r}\, e^{-jkR_r} \int_{S_{ill}} 2\,\hat{n}(\bar{r}') \times \bar{H}_0(k)\, e^{jk(\hat{i}_t + \hat{i}_r)\cdot \bar{r}'}\, dS', \qquad (7)$$

where k is the wavenumber, $S_{ill}$ the illuminated region, $\hat{n}(\bar{r}')$ the unit normal vector at the surface point $\bar{r}'$, and $\bar{H}_0(k)$ the incident magnetic field at the rotation center. The scattered field is related to the vector potential by

$$\bar{E}_s(k, \hat{i}_t, \hat{i}_r) = -j\omega \bar{A}_T(k, \hat{i}_t, \hat{i}_r), \qquad (8)$$

where $\bar{A}_T$ is the transverse component of $\bar{A}$ along the direction $\hat{i}_r$.

Fig. 3. Geometry of the scattering measurement system.

As k approaches infinity, the asymptotic expression of the above equation can be obtained by applying the stationary phase method [8] to (8). The result is

[Eqs. (9)-(11) are illegible in the scan.]

where $d/da'$ is the derivative with respect to the surface curvature. The points $\bar{r}_j'$ corresponding to the solutions of (11) are called stationary points, equiphase points, or scattering centers. The term $\hat{n}(\bar{r}_j') \times \bar{H}_0(k)/\sqrt{S_j}$ is called the scattering strength for the particular scattering center at $\bar{r}_j'$. It is seen that the locations of the scattering centers depend on the directions $\hat{i}_t$ and $\hat{i}_r$ as well as the shape of the metallic surface. The scattering strength depends on the local properties of the scattering centers. The above analysis illustrates that the object functions we would be dealing with in high-frequency radar imaging are of discrete form, consisting of point scattering centers.

If the received scattered fields have been calibrated with a reference target [1], the corrected vector potential can be expressed as

$$\bar{A}(k, \hat{i}_t, \hat{i}_r) = \sum_j [\ldots] \qquad (12)$$

(the summand is illegible in the scan). The Fourier transform of (12) will give the range profile and scattering strength of the scattering centers.

IV.  Results 

In this section, the performance of the proposed new algorithm using both simulated and realistic data will be evaluated. First, assume for simplicity that an object consisting of n point scatterers located at $(r_0 + y_j)$ is illuminated by a plane wave, where $r_0$ is the distance between the transmitter/receiver and a reference point of the object and $y_j$ is the differential range of the jth scatterer (range relative to $r_0$). Under the far-field condition and ignoring multiple scattering, and considering for simplicity a scalarized version of (12), the corrected scalar field can be expressed as

$$E_s'(k) = \sum_j a_j e^{j2ky_j}. \qquad (13)$$

In the following simulation, the theoretical values of $E_s'(k)$ are calculated in 200 equally spaced frequency steps covering the frequency range $f_1 = 6$ GHz to $f_{200} = 16$ GHz, with an assumed signal-to-noise ratio of 40 dB. These values anticipate the realistic experimental data utilized in testing the algorithm.

Assume the available (computed) data are in the following passbands: $(f_{30}, f_{80})$ and $(f_{120}, f_{170})$. We want to extrapolate the data to the vacant bands $(f_1, f_{29})$, $(f_{81}, f_{119})$, and $(f_{171}, f_{200})$. The range resolution obtained by the discrete Fourier transform (DFT) method using the whole bandwidth $(f_1, f_{200})$ is about 1.5 cm. The resolution using a single frequency band is about 5.5 cm. The resolution using both frequency bands is about 2.0 cm; however, very high sidelobe levels will be produced. We consider an object consisting of seven point scatterers. The location and scattering strength for each point scatterer are ($r_1 = -30$ cm, $a_1 = 0.5$), ($r_2 = -20$ cm, $a_2 = 0.5$), ($r_3 = -10$ cm, $a_3 = 0.5$), ($r_4 = -2$ cm, $a_4 = 1$), ($r_5 = 10$ cm, $a_5 = 0.25$), ($r_6 = 20$ cm, $a_6 = 0.25$), ($r_7 = -30$ cm, $a_7 = 0.25$). The values of the field at each sampled frequency $f_i$ are calculated using (13).
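
The simulated data set described above can be sketched as follows, using the point-scatterer model of (13) and a zero-padded DFT to form the range profile. The variable names, the FFT length, the noise seed, and the definition of the 40 dB signal-to-noise ratio (mean signal power over noise power) are assumptions; the scatterer list is copied as printed, including the seventh range, which coincides with the first.

```python
import numpy as np

C = 299_792_458.0                                   # speed of light, m/s

# Scatterer ranges (m) and strengths as listed in the text.
ranges = np.array([-30, -20, -10, -2, 10, 20, -30]) * 1e-2
strengths = np.array([0.5, 0.5, 0.5, 1.0, 0.25, 0.25, 0.25])

freqs = np.linspace(6e9, 16e9, 200)                 # 200 equally spaced frequency steps
k = 2 * np.pi * freqs / C                           # wavenumbers

# Point-scatterer model: E(k) = sum_j a_j exp(j 2 k y_j).
field = (strengths * np.exp(2j * np.outer(k, ranges))).sum(axis=1)

# Additive complex Gaussian noise at an assumed 40 dB signal-to-noise ratio.
rng = np.random.default_rng(0)
noise_power = np.mean(np.abs(field) ** 2) / 10 ** (40 / 10)
noise = rng.standard_normal(200) + 1j * rng.standard_normal(200)
field_noisy = field + np.sqrt(noise_power / 2) * noise

# Zero-padded DFT over frequency gives the range profile; the matching range
# axis follows from the frequency step (unambiguous range c / (2 * df)).
n_fft = 4096
df = freqs[1] - freqs[0]
profile = np.fft.fftshift(np.fft.fft(field_noisy, n_fft))
range_axis = np.fft.fftshift(np.fft.fftfreq(n_fft, d=2 * df / C))   # metres

peak_range = range_axis[np.argmax(np.abs(profile))]
print(f"strongest peak at {100 * peak_range:.1f} cm")
```

Keeping only the samples of the two passbands before the transform reproduces the high-sidelobe case discussed in the text, which is the situation the extrapolation is meant to improve.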

Define the extrapolation error at frequency $f_i$ as

$$\varepsilon(f_i) = |E_s'(f_i) - \hat{E}_s'(f_i)|, \qquad (14)$$

where $\hat{E}_s'(f_i)$ is the extrapolated value at $f_i$. The extrapolation errors for different algorithms are compared and shown in Fig. 4(a). The bold solid curve is the amplitude of the theoretically computed fields $E_s'(f_i)$, the thin solid curves are the extrapolation error after 100 iterations using the algorithm proposed in [3], and the dashed curves are obtained by using the Burg algorithm to find the prediction parameters from the respective passband, and using this set of parameters together with data in each passband to extrapolate to the outside regions (bands III and IV). The dotted curves are obtained using this new algorithm with one iteration and with model order 25. The algorithm proposed in [3] basically involves application of the Gerchberg algorithm to data in the multiple restricted regions. However, no numerical or experimental results are given in that paper. It is clear from the results obtained here that the algorithm in [3] does not seem to be effective in the case considered, as the errors can exceed the amplitude of the theoretical fields. Extrapolation from a single passband is not good in this example, because the model order is not sufficient to model the data series in the presence of noise. The proposed new method after one iteration is seen to produce small error.

The Fourier transforms (FT) of the all-band data (i.e., data in regions I to IV), the passband data only, and the passband plus extrapolated data with the new proposed method are shown in Figs. 4(b)-4(d), respectively. Note that the FT of spectral data yields the range profile of the scattering object. It is clear that the FT using passband data only (Fig. 4(c)) has a very high sidelobe structure, and the FT of the extrapolated data using the algorithm in [3] (Fig. 4(e)) is totally different from the original of Fig. 4(b). The result obtained by Fourier transforming the data generated by the proposed algorithm is shown in Fig. 4(d), which exhibits excellent agreement with the all-band result of Fig. 4(b). The magnitudes of the peaks in Figs. 4(b) and 4(d) depart from the original assigned values because of zero padding used in the fast Fourier transform (FFT) algorithm. This lack of fidelity in scattering strength reconstruction does not have a discernible degrading effect on the quality of the reconstructed image, as will be illustrated below, but is important and must be dealt with when quantitative analysis of scattering strengths is needed.

Fig. 4. (a) Magnitude of theoretical fields and comparison of extrapolation errors of different methods, $f_1 = 6$ GHz, $f_{200} = 16$ GHz. Bold solid: magnitude of theoretical fields; dashed: extrapolation error from a single passband, no iteration; dotted: extrapolation errors from new iterative algorithm; thin solid: extrapolation errors from algorithm proposed in [3]. (b) FFT of the whole band data. (c) FFT of the passband data. (d) FFT of the passband and extrapolated data with one iteration. (e) FFT of the passband and extrapolated data using algorithm proposed in [3]. (Horizontal axes of (b)-(e): distance in cm.)

If the frequency coverage is increased to ($f_1 = 6$ GHz, $f_{200} = 20$ GHz) with the number of sampling points fixed at 200 and the passbands kept at $(f_{30}, f_{80})$ and $(f_{120}, f_{170})$, the computed fields and the extrapolation errors would be as shown in Fig. 5. It is seen that the extrapolation error indicated by the dashed line becomes smaller. If the frequency coverage is decreased to ($f_1 = 6$ GHz, $f_{200} = 12$ GHz), the results would be as shown in Fig. 6. It is seen that the extrapolation errors indicated by the dashed and dotted curves are now both high. The FFT of the whole band data, passband data only, and the extrapolated plus passband data using this method are shown in Figs. 6(b)-6(d), respectively. The results in Figs. 5 and 6 indicate the desirability of using segmented spectral data spanning wider spectral ranges.

Fig. 5. Magnitude of theoretical fields and comparison of extrapolation errors with and without iteration, $f_1 = 6$ GHz, $f_{200} = 20$ GHz. Solid: magnitude of theoretical fields; dashed: extrapolation error from respective passband, no iteration; dotted: extrapolation errors from new iterative algorithm. (Horizontal axis: sample point.)

Fig. 6. (a) Magnitude of theoretical fields and comparison of extrapolation errors with and without iteration, $f_1 = 6$ GHz, $f_{200} = 12$ GHz. Solid: magnitude of theoretical fields; dashed: extrapolation error from respective passband, no iteration; dotted: extrapolation errors from new iterative algorithm. (b) FFT of the whole band data. (c) FFT of the passband data. (d) FFT of the passband and extrapolated data with one iteration.

Although  the  above  algorithm  is  an  iterative  one,  it  was 
found  that  extrapolation  errors  usually  decrease  significantly 
after  the  first  iteration,  and  further  iterations  do  not  seem  to 
improve  the  results.  Therefore,  it  is  practical  and  frequently 
sufficient  to  use  only  one  iteration. 

Fig. 7. (a) Magnitude of the measured fields and comparison of
extrapolation errors without and with one iteration. Solid line:
magnitude of theoretical fields; dashed line: extrapolation error from
respective passband, no iteration; dotted line: extrapolation errors
from new iterative algorithm. (b) FFT of the whole band data. (c) FFT of
the passband data. (d) FFT of the passband and extrapolated data with
one iteration. (Axis label: distance in cm.)

The performance of the algorithm using realistic data is also
evaluated. The test object, a metalized 100:1 scale model of a
B-52 aircraft with 79-cm wing span and 68-cm long fuselage,
was mounted on a computer-controlled elevation-over-azimuth
positioner situated in an anechoic chamber environment.
Two hundred and one equal frequency steps covering the f_1 =
6.1 to f_201 = 17.5 GHz range were used to obtain the
frequency response of the object as described in [1]. The target
was positioned at a fixed elevation angle of 30° while the
azimuth angle was altered between 0° and 90° in steps of 0.7°
for a total of 128 angular looks.
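
As a quick check on the sampling just described, the short sketch below computes the frequency step, the range resolution, the unambiguous range window, and the angular increment implied by the quoted numbers. The relations c/(2B) for range resolution and c/(2Δf) for the unambiguous range are the standard stepped-frequency results; they are not stated in the text and are used here as assumptions. All variable names are illustrative.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

# 201 equally spaced frequencies from 6.1 GHz to 17.5 GHz (values from the text)
freqs = np.linspace(6.1e9, 17.5e9, 201)
freq_step = freqs[1] - freqs[0]
bandwidth = freqs[-1] - freqs[0]

# Standard stepped-frequency relations (assumed, not quoted from the paper)
range_resolution = C / (2.0 * bandwidth)
unambiguous_range = C / (2.0 * freq_step)

# 128 looks spanning 0 to 90 degrees in azimuth
azimuth_deg = np.linspace(0.0, 90.0, 128)
azimuth_step = azimuth_deg[1] - azimuth_deg[0]

print(f"frequency step    : {freq_step / 1e6:.1f} MHz")
print(f"range resolution  : {range_resolution * 100:.2f} cm")
print(f"unambiguous range : {unambiguous_range:.2f} m")
print(f"azimuth step      : {azimuth_step:.3f} deg")
```

Under these assumptions the unambiguous range window comfortably exceeds the 79-cm wing span of the scale model, so the range profile is not aliased.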

The  passband  is  first  defined  as  (/30,  /go)  and  (/,2 0,  /no)- 
The measured values and the extrapolation errors of the
broadside look, which is 90° from the head-on look, are shown
in Fig. 7(a). The solid curve is the amplitude of the range-
phase corrected field (see [1]). The dashed curve represents
the extrapolation error resulting from extrapolating from each
single band (bands I, II) with model order 25 as described in
step 1 of the proposed algorithm. The dotted curves are
obtained using the new algorithm with one iteration and model
order 25. The extrapolation error for measurement is defined,
in a manner similar to the definition of error in numerical
simulation, as the magnitude of the difference between the
corrected measured fields and the extrapolated fields. The Fourier
transforms of the whole band data, the passband data only,
and the passband together with extrapolated data are shown in
Figs. 7(b), 7(c), and 7(d), respectively. The Fourier transform of
the corrected scattered fields gives the range profile of the
target in that view. In this figure, it is seen that the
extrapolation errors do not improve after one iteration. The
reason can be seen from the plot of the range profile
shown in Fig. 7(b). In this view direction, the major
contributions to the scattered fields are due to the fuselage and
primarily those engines and fuel tank which are on the
illuminated side. The specular scattering points are well
separated in time or distance and their number is small. Hence
the linear prediction parameters obtained from a single passband
are sufficient to model the data sequence. The extrapolation
errors are not as small as those obtained by simulations. The
reason for this is that the applicability of the linear prediction
model to the extrapolation of scattered fields of a metallic
object is based on the high-frequency approximation. In the
measurement data, however, polarization effects, edge
diffraction, multiple scattering, and the failure to satisfy the
high-frequency approximation in the lower region of the frequency
band utilized in the measurement will degrade the performance
of the algorithm.

Fig. 8. Reconstructed images of the metalized scale model B-52
aircraft using an angular window of 90° extending from head-on to
broadside in 128 looks and different spectral coverage.
Reconstructions from: (a) Entire bandwidth. (b) Two-segment
passband. (c) Two-segment passband and extrapolated data
(extrapolated into the empty bands) without iteration. (d)
Two-segment passband and extrapolated data with one iteration.
(e) Single passband. (f) Single passband and extrapolated data
without iteration. (g) Single passband and extrapolated data with
one iteration.
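
The single-band extrapolation step and the error measure discussed above can be illustrated with a minimal sketch. It estimates linear-prediction coefficients from one passband with the Burg recursion, extends the data forward or backward, and evaluates the magnitude of the difference from the true samples. This is an illustration under stated assumptions, not the authors' implementation: the iterative refinement is omitted, and the synthetic field, band limits, and model order below are hypothetical values chosen for the demonstration.

```python
import numpy as np

def burg(x, order):
    """Prediction-error filter a (a[0] = 1) estimated by Burg's recursion."""
    x = np.asarray(x, dtype=complex)
    f = x[1:].copy()    # forward prediction errors
    b = x[:-1].copy()   # backward prediction errors, delayed one sample
    a = np.array([1.0 + 0.0j])
    for _ in range(order):
        den = np.vdot(f, f).real + np.vdot(b, b).real
        if den <= 1e-12 * np.vdot(x, x).real:   # data already predicted exactly
            break
        k = -2.0 * np.vdot(b, f) / den           # reflection coefficient
        a_pad = np.append(a, 0.0)
        a = a_pad + k * np.conj(a_pad[::-1])     # Levinson-type coefficient update
        f, b = f[1:] + k * b[1:], b[:-1] + np.conj(k) * f[:-1]
    return a

def extrapolate(x, a, n_extra):
    """Extend x forward using x[n] = -sum_i a[i] * x[n - i]."""
    m = len(a) - 1
    y = list(np.asarray(x, dtype=complex))
    for _ in range(n_extra):
        past = np.array(y[-1:-m - 1:-1])         # x[n-1], ..., x[n-m]
        y.append(-np.dot(a[1:], past))
    return np.array(y[len(x):])

def extrapolate_backward(x, a, n_extra):
    """Extend x backward by predicting the reversed, conjugated sequence."""
    rev = np.conj(np.asarray(x, dtype=complex)[::-1])
    return np.conj(extrapolate(rev, a, n_extra))[::-1]

# Hypothetical test field: three well-separated specular returns over 201 steps
n = np.arange(201)
delays = np.array([0.05, 0.12, 0.31])            # cycles per frequency step (assumed)
amps = np.array([1.0, 0.6, 0.8])
field = (amps * np.exp(-2j * np.pi * np.outer(n, delays))).sum(axis=1)

band = field[30:91]                              # illustrative passband
coeffs = burg(band, order=10)
ahead = extrapolate(band, coeffs, 30)
error = np.abs(field[91:121] - ahead)            # extrapolation error as defined above
print("largest extrapolation error:", error.max())
```

A small number of well-separated scatterers, as in the broadside view, corresponds to a short sum of complex exponentials, which is the case in which a low-order linear-prediction model fits best.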

The reconstructed images of the test object using data
collected in an angular window of 90° extending from head-on
to broadside in 128 looks (see [1] for details) and different
frequency bands are shown in Fig. 8. The transmitting antenna
is right-hand circularly polarized and the receiving antenna is
left-hand circularly polarized, which constitutes, by the
convention given in [9], a co-polarized transmitting/receiving
system. Fig. 8(a) is obtained by using the whole band data.
Fig. 8(b) is obtained by using the passband data alone. Figs.
8(c) and 8(d) are obtained by extrapolating without iteration
and after one iteration, respectively. The model order used is
M = 25 in both cases.

If  the  passband  is  defined  as  ( /<,<,  /n„),  the  reconstructed 
images obtained by using the passband data alone and by
extrapolation without iteration and after one iteration are
shown in Figs. 8(e), 8(f), and 8(g), respectively. The model
order used is also M = 25.

It is seen that the image quality of Figs. 8(c) and 8(e) is as
good as that of Fig. 8(a). These results show the effectiveness
of  the  application  of  the  proposed  algorithm  to  radar  imaging 
from  segmented  data  bands. 

V.  Conclusion 

A  new  method  employing  the  Burg  algorithm  and  an 
iterative  procedure  to  extrapolate  observed  data  beyond 
restricted  regions  of  observation  has  been  proposed  and  tested. 
Simulation  and  experimental  results  prove  the  effectiveness  of 
this  proposed  method.  The  algorithm  is  especially  effective 
when  the  spectra  of  the  collected  data  (the  object  range  profile 
in  this  case)  are  in  discrete  form.  Possible  applications  of  this 
new method can be found in diverse fields whenever the data is
available in restricted bands. For example, in a multiple-band
microwave imaging system, the quality of the image obtained
by extrapolating from a much smaller bandwidth can be as
good as that obtained from data in the full bandwidth. The cost of
the imaging system can hence be reduced drastically, as the
cost of the required gear can be much lower than the cost of the
gear to cover the full bandwidth, and restrictions on use of
frequency bands can be accommodated.
Radar targets can be identified by either forming images with
sufficient resolution to be recognized by the human observer or
by forming signatures or representations of the target for
automated machine recognition. Tomographic Microwave Diversity
Imaging techniques that combine angular (aspect), spectral, and
polarization degrees of freedom have been shown, as summarized
in the first part of this paper, to be capable of producing images of
the scattering centers of a target with near-optical resolution.
Despite this capability there are circumstances when the size and/
or cost of the physical aperture needed to furnish angular degrees
of freedom is too high, or when the time delay involved in
synthesizing such an aperture through relative motion between the
radar system and the object being imaged (as, for example, in SAR
and ISAR) is not acceptable. One is faced then with the problem
of having to identify the target from a limited amount of
information that is insufficient to produce an identifiable image. We
show that collective nonlinear signal processing based on models of
neural networks combined with the use of suitable target
signatures (here sinogram representations) offers the promise of
robust super-resolved target identification from partial information.
Results presented are of numerical simulations for a neuromorphic
processor where the neural net performs simultaneously the
functions of data storage, processing, and recognition by
automatically generating an identifying object label, and fast
optoelectronic architectures and hardware implementations are
briefly mentioned. Correct identification from as low as 10 percent
of the full sinogram representations derived from real data
collected in an anechoic chamber environment for three test targets
(scale models of B-52, AWAC, and Space Shuttle) and taught to the
network is demonstrated. Practical considerations and extensions to
real systems are briefly discussed. The neuromorphic approach to
target identification introduced here has the promise of obviating
the need for large costly apertures that are needed for the imaging
of remote targets. It also suggests that nonlinear multidimensional
dynamical systems may provide an avenue to the problem of target
identification from a single wide-band radar echo.

I.  Introduction 

There are two distinct approaches to radar target
identification. One is microwave image formation followed by
recognition and identification by a human observer, i.e., by
the eye-brain system. Here, one is concerned with concepts
and methodologies for endowing the images formed
with the highest resolution possible to facilitate their
reliable identification by human observers. Near-optical
resolution and cost effectiveness are usually the objective. The
second approach is automated recognition of the target by
a machine using suitable target signatures or
representations. This approach is called for when we do not have
sufficient information about the Radar Cross Section (RCS) of
the various parts of the object to be able to define it. Here,
one is concerned with issues of correct identification given
partial or sketchy information irrespective of range or aspect
of the target or its location within the field of view with the
help of systems that can do this in a robust and fault-tolerant
manner. In this second approach, the processing carried
out by the eye-brain system in identifying the image in the
first approach is to be mimicked by a machine. The motives
for automated recognition are varied, with speed and cost
effectiveness ranking high among them. Both approaches
involve amplitude and phase measurements of radar echoes
from complex-shaped objects as a function of orientation,
frequency, and polarization using the same gear widely
employed in making complex RCS measurements. In the
following, the terms identification and recognition will be
used interchangeably.

In this paper, we discuss both approaches described
above and show how they are interrelated and how an
understanding of the microwave imaging process and target
representation is required for the formulation of
methods for automated target identification. We begin in
Section II with a qualitative review of the principles and
methodologies of tomographic microwave diversity imaging
extensively studied and developed in our laboratory,
where it is shown that microwave diversity imaging
provides 3-D tomographic or projective images of scattering
objects with near-optical resolution employing spectral,
angular, and polarization degrees of freedom. Because of
space limitations, it is not the aim here to dwell at length
on the principles and methodologies of microwave
diversity imaging, which have been adequately described in
earlier publications [1]-[8], [11]-[21]. Instead, the discussion
here is made intentionally brief but with sufficient detail to
provide the background for the ensuing treatment of
automated target identification. This is done by bringing out
those attributes of microwave diversity imaging that are
relevant to automated machine recognition. This is followed
in Section III by a discussion of a new approach to target
identification from incomplete or sketchy information
based on models of neural networks. The work is motivated
by the desire to further reduce the projected cost of
microwave diversity imaging systems and by the fact that there
exist important circumstances when a real (physical) or
synthetic baseline for an imaging aperture cannot be formed
because of physical constraints in the former or because
the time delay associated with aperture synthesis by target
motion in the context of Inverse Synthetic Aperture Radar
(ISAR) is not acceptable. The aim is therefore to achieve
automated recognition from partial information, especially
when the amount of information available about the target
is so meager that formulation of a recognizable image is out
of the question. Our interest in neural signal processing or
"brain-like" processing is readily appreciated when one
notes the associative memory attributes of the eye-brain
system, its amazing ability at supplementing or completing
missing information, and the apparent ease and speed with
which it solves ill-posed problems of the type encountered
in vision, speech, and cognition in general. Neural
processing furnishes a new powerful approach to signal
processing that is both robust and fault tolerant and can be
extremely fast when implemented optoelectronically in
order to fully exploit the fit between what neural models
can offer (powerful collective, nonlinear, and iterative
(dynamical) processing) and what optics can offer
(parallelism and massive interconnectivity) [9], [10]. The
discussion in Section III includes descriptions of neural-net
models and refers to optoelectronic architectures for
realizing content-addressable associative memories that can be
useful in radar target recognition. Results representing the
performance of software implementation of such neural
processors in the recognition of scale models of aerospace
targets employing sinogram representations are given. The
sinogram representation is chosen as an example of a target
representation (feature space or signature space) that is
suitable for use with neural processors. Other
representations involving low-frequency polarization maps, e.g.,
plots of the state of polarization of the scattered field as a
function of frequency on an inclination angle versus
ellipticity angle Cartesian coordinate plane, and pole-residue
representation [29] of the scattered field, can be equally
considered.

Machine recognition with artificial neural networks relies
therefore on the generation of target signatures
(representations of target features or attributes) that can lead to
"distortion tolerant" recognition, i.e., recognition irrespective
of target range, orientation, or location within the field of
view, traditionally referred to as scale, rotation, and shift
invariant recognition. The generation of such
representations usually involves the same gear employed in
microwave (μw) and millimeter wave (mmw) diversity imaging or
in performing RCS measurements. In fact, the sinogram
representation contains, as will be shown below, exactly
the same information contained in a μw/mmw image of the
target except that the information is arranged in a different
format that is more amenable for use in automated
recognition schemes. The work presented here shows that
recognition of objects from partial information that can be as
low as 20 to 10 percent of the sinogram representation is
possible with neural net analog processors employing
heteroassociative storage and recall where the outcome is a word
label describing the recognized object. The neural net in this
sense performs the functions of storage, processing, and
recognition (labeling) simultaneously. The work also
suggests a possible approach to target identification from a
single broad-band radar echo based on nonlinear dynamical
system theory and adaptive learning which will be briefly
outlined.

II. Microwave Diversity Imaging

In  this  section,  a  brief  qualitative  outline  of  the  principles, 
methodologies,  and  capabilities  of  microwave  diversity 
imaging  is  presented. 

A. Principles

Target-shape estimation in the context of inverse
scattering from far-field data is a longstanding problem with
considerable present-day interest that has been studied by
many (see, for example, [2], [3], [11]-[19]). It can be shown
from basic electromagnetic scattering theory, assuming that
physical optics and Born approximations hold, that
monostatic or bistatic measurement of the far field
scattered by an object as a function of illuminating frequency
and object aspect can be used to access the Fourier space
Γ(p) of the object scattering function γ(r). Here, p and r are
three-dimensional (3-D) position vectors in Fourier space and
object space, respectively. The object-scattering function
γ can be loosely interpreted to represent the 3-D
geometrical distribution and strength (RCS) of those scattering
centers of the object that contribute to the measured field. The
Fourier space-data manifold Γ_m(p) measured in practice is
necessarily of a finite extent which depends on the values
of p realized in the measurement. These depend in turn on
geometry and on the angular and spectral windows
utilized. It is possible then to retrieve a diffraction and
noise-limited version γ_d of the object-scattering function by 3-D
Fourier inversion of Γ_m. In particular, tomographic or
projective reconstruction of γ_d based on the projection slice
theorem or the Radon transform (see, for example, [16]) has
been demonstrated from computed [2], [3], [15], [16] and
experimental [5], [6] data. Image reconstruction using
a filtered back-projection algorithm has also been
demonstrated [20] and shown to yield images with equivalent
quality to those obtained by Fourier inversion.
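
The relation between the measured Fourier-space samples and the scattering function can be illustrated with a small two-dimensional sketch. It assumes ideal point scatterers, a monostatic geometry in which a measurement at frequency f and aspect angle φ samples Fourier space at radius 4πf/c, and a range-phase-corrected field with an exp(-j p·r) sign convention. These are assumptions made for illustration; the sign convention and the function names are not taken from the paper. The inversion is a direct (slow) Fourier summation over the polar samples rather than an interpolation followed by an FFT, and no radial density weighting is applied.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def fourier_samples(scatterers, freqs, angles):
    """Synthetic monostatic Fourier-space data for ideal point scatterers.

    scatterers: iterable of (x, y, amplitude); returns field[angle, freq]
    together with the Fourier-space coordinates (px, py) of each sample.
    """
    radius = 4.0 * np.pi * freqs / C              # |p| = 2k for monostatic data
    px = radius[None, :] * np.cos(angles)[:, None]
    py = radius[None, :] * np.sin(angles)[:, None]
    field = np.zeros(px.shape, dtype=complex)
    for x, y, amp in scatterers:
        field += amp * np.exp(-1j * (px * x + py * y))
    return field, px, py

def invert(field, px, py, xs, ys):
    """Direct Fourier summation of the polar samples onto an image grid."""
    grid_x, grid_y = np.meshgrid(xs, ys)          # image shape is (len(ys), len(xs))
    image = np.zeros(grid_x.shape, dtype=complex)
    for value, qx, qy in zip(field.ravel(), px.ravel(), py.ravel()):
        image += value * np.exp(1j * (qx * grid_x + qy * grid_y))
    return np.abs(image) / field.size

# Hypothetical example: two point scatterers, 6-17 GHz, 90-degree angular window
freqs = np.linspace(6e9, 17e9, 64)
angles = np.deg2rad(np.linspace(0.0, 90.0, 64))
xs = np.linspace(-0.2, 0.2, 41)                   # metres
ys = np.linspace(-0.2, 0.2, 41)
field, px, py = fourier_samples([(0.10, -0.05, 1.0), (-0.12, 0.08, 0.6)], freqs, angles)
image = invert(field, px, py, xs, ys)
print("image shape:", image.shape)
```

Because the samples cover only a finite sector of Fourier space, the reconstructed peaks have finite width, which is the diffraction limit referred to above.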

Accessing the Fourier space of a scatterer in practice is
not direct. It requires preprocessing of the scattered far field
one measures in order to remove an undesirable phase
factor due to propagation between the target and the receiver
and to remove the effects of clutter and measurement
system response [6]-[8]. The range-phase removal is essential
for image reconstruction and is synonymous with
synthesizing a common phase reference or phase center on the
target. It can be interpreted as a Target Derived Reference
(TDR) method [21] in which the target itself is made to
furnish in effect the reference phase for the complex field
measurements at an observation point. The vector nature of
electromagnetic scattering can be treated by assuming that
the scattering matrix, which characterizes the polarization
properties of the target and hence provides added
information, is measured at every frequency and aspect angle
at which the scattering target is observed. A polarization
enhanced image can in principle be obtained by incoherent
superposition (addition of intensities) of the images formed
from the accessed Fourier-space data associated with each
of the four components of the scattering matrix. In the work
described below, polarization enhancement of the images
is achieved by incoherent superposition of images derived
from only the copolarized and cross-polarized components
of the scattered field.
The  above  concepts  represent  the  basic  principles  on 
which  the  methodologies  of  microwave  diversity  imaging, 
discussed  next,  are  based. 

B. Methodologies

In our work, the Fourier space of a scattering object is
accessed using an automated experimental radar scattering
and microwave imaging facility (see Fig. 1). The facility
enables accessing the Fourier space of scale models of targets
of interest placed in an anechoic chamber over extended
microwave windows (10 MHz-26.5 GHz), for any state of
polarization of the transmitting and receiving antennas, and
for any aspect or target viewing angle. The instrumentation
shown measures the stepped frequency response of the
scatterer. Virtually any radar imaging configuration or
innovative imaging concept can be readily simulated cost
effectively. Inverse synthetic aperture radar (ISAR), spot-light
imaging, and array imaging can all be simulated and
studied. Also, any illuminating pulse can be synthesized by
controlling the amplitude and phase of the CW signals used to
illuminate and acquire the stepped frequency response of
the target. In the arrangement shown in Fig. 1, the
transmitting and receiving antennas are nearly monostatic, but
bistatic and multi-static measurements can also be
performed. State-of-the-art microwave instrumentation is used
to enable making complex scattered field measurements
with extremely high accuracy (±0.1 dB, ±0.5 degree) over
a dynamic range of better than 80 dB. Better accuracy is
achieved by averaging several independent readings at each
measurement frequency. Frequency can be set
automatically with an accuracy of better than 4 Hz and with stability
of better than 240 Hz. Results demonstrating the
capabilities of the facility in microwave diversity imaging of several
representative targets are shown in Figs. 2 and 3. The
Fourier slices shown in Fig. 2 consist of polar plots of 128
frequency responses of the test object corrected for range-
phase and system response taken over an angular window
of 90° extending in azimuth from head-on to broadside at
a fixed elevation angle θ with each view containing 128
frequency points. In these polar plots, frequency is along the
radial direction and aspect (azimuth angle) is in the angular
direction. Interpolation of the polar formatted data of a slice
onto a rectangular grid followed by Fourier inversion yields,
in accordance with the projection slice theorem [22], a
projection image of the scattering centers of the test object.
The projection image represents the projection of the
scattering centers of the target on a plane normal to the
azimuthal axis of rotation (plane parallel to the plane of the
Fourier slice). Fourier inversion of the frequency response
for a given viewing angle yields the complex impulse
response or complex range-profile of the target at that angle.
The range-profile resembles the echo or response of the
target when subjected to impulsive plane-wave
illumination for the given viewing angle. The complex nature of the
range-profile is caused by the fact that only positive spectral
windows can be employed in practice. By displaying the
modulus of the complex range-profiles side by side against
the azimuthal angle of rotation φ, one obtains the sinogram
representation for a given elevation angle θ of the object.
Sinograms are discussed further and utilized in Section III.
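
The step from stepped-frequency responses to range-profiles and then to a sinogram can be sketched as follows. The synthetic data assume ideal point scatterers in a monostatic geometry with range-phase already removed, so that a scatterer at (x, y) contributes a phase proportional to its projected range x cos φ + y sin φ. The scatterer positions and function names are hypothetical; only the 128 x 128 sampling and the 6-17 GHz window are taken from the figure description.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def range_profiles(field, freqs):
    """Inverse FFT along frequency; field[view, freq] -> profiles[view, range_bin]."""
    n_freq = field.shape[1]
    freq_step = freqs[1] - freqs[0]
    profiles = np.fft.fftshift(np.fft.ifft(field, axis=1), axes=1)
    # Delay axis of the inverse FFT converted to two-way range in metres
    ranges = np.fft.fftshift(np.fft.fftfreq(n_freq, d=freq_step)) * C / 2.0
    return profiles, ranges

# Hypothetical target: two point scatterers (x, y, amplitude), positions in metres
scatterers = [(0.30, 0.10, 1.0), (-0.20, 0.05, 0.7)]
freqs = np.linspace(6e9, 17e9, 128)
angles = np.deg2rad(np.linspace(0.0, 90.0, 128))
wavenumber2 = 4.0 * np.pi * freqs / C             # two-way wavenumber 2k

field = np.zeros((angles.size, freqs.size), dtype=complex)
for x, y, amp in scatterers:
    projected = x * np.cos(angles) + y * np.sin(angles)   # range seen in each view
    field += amp * np.exp(-1j * np.outer(projected, wavenumber2))

profiles, ranges = range_profiles(field, freqs)
sinogram = np.abs(profiles)                       # rows: aspect angle, columns: range
print("sinogram shape:", sinogram.shape)
```

Each scatterer traces a sinusoidal arc in the (angle, range) plane, which is the origin of the name sinogram.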

Fig. 2. Results of microwave diversity imaging of a (70:1)
scale model of the space shuttle. (a) Object shown mounted
on azimuth positioner (turntable) at an inclination angle θ
= 30°. Magnitude of (b) co-polarized and (c) cross-polarized
Fourier space slices taken. (d) Polarization and symmetry-
enhanced projection image. (In (b) and (c), radial coordinate
represents frequency f and angular coordinate represents
azimuthal angle φ; (6 < f < 17) GHz in 128 frequency steps
and 0 < φ ≤ 90° in 128 angle increments.)

Fig. 3. Examples of projection images of two test objects.
(a) Co-polarized image. (b) Cross-polarized image. (c)
Polarization enhanced image. (d) Symmetry enhanced image.

Fig. 3 shows examples of projection images of two test
objects and the process of their enhancement by
polarization diversity and symmetrization. Circularly polarized
plane-wave illumination was used, and both the
co-polarized and cross-polarized components of the scattered field
were measured; associated Fourier space slices were
formed from which images were obtained. It is evident from
the images formed for the different polarization states that
these contain some complementary information.
Therefore, some image enhancement can be expected when the
intensities of the co-polarized and cross-polarized images
are added, as demonstrated by the images in Fig. 3(c).
Because manmade objects of interest in imaging radars are
invariably symmetrical and their plane or planes of
symmetry can be inferred from their heading, symmetrization
can be used to enhance the image further. As simple a
concept as it is, symmetrization is a powerful tool developed
in our work to exploit the affinity of the eye-brain system
in recognizing symmetric patterns (e.g., ink blots employed
in cognitive experiments). In certain instances, poor images
that were hardly recognizable became meaningful and
recognizable after symmetrization. Symmetrization of the
polarization enhanced images in Fig. 3(c) about the vertical
line of symmetry running through the fuselage was
performed digitally, leading to the polarization and symmetry
enhanced images shown in Fig. 3(d). The image shown in
Fig. 2(d) was polarization and symmetry enhanced in the
fashion described. Also, all images shown were actually
magnified in the vertical direction by a factor 1/cos θ = 1.155,
θ being the inclination angle at which data was acquired,
in order to obtain a properly scaled projection image of the
scattering centers as they would be seen, for example, in a
top view of the test object shown in Fig. 2(a). It is seen that
features of the test objects used are delineated clearly in
correct geometrical relation and relative size, enabling quick
recognition of the scatterer by the eye-brain system. The
image resolution achieved is of the order of 2 cm employing
a (6-17)-GHz spectral window. It is worth noting that all
images are naturally edge enhanced because of the
specular nature of microwave scattering from smooth flat
surfaces of the objects tested.
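
The three enhancement steps described above (incoherent addition of the two polarization images, symmetrization about a vertical line, and vertical magnification by 1/cos θ) can be sketched as below. The text does not say how the symmetrization or the rescaling was computed, so averaging the image with its mirror about the central column and linear interpolation along the vertical axis are assumptions made here for illustration, and the function names are hypothetical.

```python
import numpy as np

def polarization_enhance(co_image, cross_image):
    """Incoherent superposition: add the intensities of the two complex images."""
    return np.abs(co_image) ** 2 + np.abs(cross_image) ** 2

def symmetrize(intensity):
    """Average the image with its mirror about the central vertical line (assumed rule)."""
    return 0.5 * (intensity + intensity[:, ::-1])

def stretch_vertical(image, theta_deg):
    """Magnify along the vertical axis by 1/cos(theta) using linear interpolation."""
    factor = 1.0 / np.cos(np.deg2rad(theta_deg))
    n_rows = image.shape[0]
    n_out = int(round(n_rows * factor))
    source_rows = np.arange(n_out) / factor       # fractional input row per output row
    rows = np.arange(n_rows)
    columns = [np.interp(source_rows, rows, image[:, j]) for j in range(image.shape[1])]
    return np.stack(columns, axis=1)

# Hypothetical complex images standing in for the co- and cross-polarized reconstructions
rng = np.random.default_rng(0)
co = rng.standard_normal((100, 64)) + 1j * rng.standard_normal((100, 64))
cross = rng.standard_normal((100, 64)) + 1j * rng.standard_normal((100, 64))

enhanced = stretch_vertical(symmetrize(polarization_enhance(co, cross)), theta_deg=30.0)
print("magnification factor:", 1.0 / np.cos(np.deg2rad(30.0)))
print("output shape:", enhanced.shape)
```

For θ = 30° the magnification factor evaluates to about 1.155, in agreement with the value quoted in the text.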

The quality and edge-enhanced nature of the microwave
diversity images obtained above suggest they are well suited
for automated pattern recognition by a machine, especially
since the TDR technique results in images that are always
centered within the image plane. This may be useful in
certain situations. But when a human observer (the ultimate
in pattern recognition systems) is available to analyze and
recognize the image, the benefits of automated recognition
of the image become questionable. Moreover,
conventional pattern recognition works best when a good image
is available and may falter when the image is incomplete
or the amount of available information about the object or
the target is insufficient for image formation. Of course, this
is exactly the challenge in practice, namely target
recognition from sketchy (partial and/or noisy) information which
when taken by itself would not be sufficient to form a
recognizable image. What is needed therefore is an automated
recognition algorithm, of the kind described below, that
can identify objects or targets even when the available
information is sketchy.
Among its many astounding information processing
capabilities, such as robustness and fault tolerance, the
brain is also able to recognize objects from partial
information. We can recognize a partially obscured or
shadowed face of an acquaintance or a mutilated photograph
of someone we know with little difficulty, and in reading
text we are easily able to fill in for misspelled or mistyped
words.  The  same  is  true  with  understanding  spoken  lan¬ 
guage.  The  brain  has  a  knack  for  supplementing  missing 
information.  Capitalizing  on  this  observation  and  on  our 
knowledge  of  neural  models  and  their  collective  compu¬ 
tational  properties,  a  study  of  "neural  processing"  for 
recognizing  microwave  objects  from  partial  information 
was  undertaken.  Details  and  results  are  given  in  the  next 
section. 

III.  Automated  Target  Recognition  Based  on  Models 
of  Neural  Nets 

Neural-net  models  and  their  analogs  furnish  a  new 
approach  to  signal  processing  that  is  nonlinear  collective, 
robust,  and  fault  tolerant.  These  models  are  highly  stylized 
versions  of  biological  neural  nets  in  which  neurons  act  as 
decision  making  elements  and  the  weights  of  intercon¬ 
nections  between  them  represent  the  stored  information 
or  memory.  A  neuron  receives  exitatory  and  inhibitory 
inputs  from  other  neurons  and  decides  to  fire,  sending  its 
own  signal  in  the  form  of  a  train  of  impulses  to  other  neu¬ 
rons,  or  not  to  fire,  depending  on  whether  or  not  the  sum 
of  the  input  signals  to  the  neuron  exceeds  or  not  a  pre¬ 
scribed  threshold.  The  rate  of  firing  (spike  frequency)  as  a 
function  of  the  sum  of  inputs  and  threshold  value  repre¬ 
sents  the  transfer  function  or  response  of  the  neuron.  The 
transfer  function  is  usually  highly  nonlinear,  making  a 
neural  net  in  essence  a  nonlinear  multidimensional  dynam¬ 
ical  system  with  very  rich  phase-space  behavior.  A  step 
function  response  is  assumed  for  the  neurons  in  the  treat¬ 
ment  here  and  discrete  evolution  of  the  state  of  the  net  in 
time,  taken  as  an  iteration  number,  is  adopted.  This  results 
in  a  neural  net  with  binary  neurons  (neurons  firing  or  not 
firing). The state vector of such a net consisting of, say, N
neurons is then represented by a point in the N-dimensional
phase-space of the net falling on a vertex of a hypercube,
and the behavior of the net can be visualized as stepped
motion of the state vector in phase-space over the vertices
of the hypercube. The specific phase-space trajectory of a
net depends on the weights or connectivity matrix, on the
neurons' response and their threshold levels, on the initial
state of the net, and on any external input signals the
neurons receive besides input signals from other neurons.
The  recipe  used  below  for  storing  information  in  the  net 
produces  fixed  points  in  phase-space  of  the  net  that  act  as 
attractors  for  initial  states  that  fall  within  their  basins  of 
attraction;  this  operation  represents  the  associative  mem¬ 
ory  or  content  addressable  memory  attribute  of  such  net¬ 
works  and  their  ability  to  supplement  missing  information 
that will be elaborated on below. The dynamical phase-space
behavior sketched above is what distinguishes the neural-net
processing (neuromorphic processing) paradigm from other
approaches to signal processing and is the underlying basis
for the new approach to target identification from partial
information we present in this and subsequent sections. In
this approach, the measured microwave scattering data is
placed first in a format suited for neural-net processing
before the above associative memory function is activated,
as will be detailed later. Optical implementations of neural
nets (see, for example, [9] and [10]) are attractive because
of the inherent parallelism and massive interconnection
capabilities provided by optics, and because of emergent
optical technologies that promise
high  resolution  and  high-speed  programmable  spatial  light 
modulators  (SLMs)  and  arrays  of  bistable  optical  devices 
(optical  decision  making  elements)  that  can  facilitate  the 
implementation  and  study  of  large  networks.  Optical 
implementation of a one-dimensional network of 32 neurons
exhibiting robust content-addressability and associative
recall has already been demonstrated to illustrate the
above advantages [10]. By robust we mean fault tolerance
and the ability to correctly recall from partial input data
which may also contain errors. By one-dimensional we mean
that  (in  the  architecture  used  there),  the  neurons  are 
deployed  on  a  line.  Two-dimensional  arrangements  of  neu¬ 
rons  are  also  possible  and  these  are  of  interest  because  they 
are  suitable  for  the  processing  of  2-D  image  data  or  2-D 
object  representations  directly  as  described  below,  and 
offer  a  way  for  optical  implementation  of  denser  networks. 
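
The binary threshold model described above can be illustrated with a minimal sketch. It assumes synchronous updating and a hypothetical three-neuron weight matrix chosen only for illustration; the names `step_update` and `evolve` are not from the original work.

```python
import numpy as np

def step_update(weights, state, thresholds):
    """One synchronous iteration: a neuron fires (1) if its summed input exceeds its threshold."""
    return (weights @ state > thresholds).astype(int)

def evolve(weights, state, thresholds, max_iter=20):
    """Iterate until the state stops changing; return the hypercube vertices visited."""
    trajectory = [np.asarray(state)]
    for _ in range(max_iter):
        nxt = step_update(weights, trajectory[-1], thresholds)
        trajectory.append(nxt)
        if np.array_equal(nxt, trajectory[-2]):
            break
    return trajectory

# Hypothetical net: neurons 0 and 1 excite each other; neuron 2 and that pair inhibit each other.
W = np.array([[0, 1, -1],
              [1, 0, -1],
              [-1, -1, 0]])
theta = np.zeros(3)
path = evolve(W, np.array([1, 1, 0]), theta)
```

Each entry of `path` is a vertex of the 3-D hypercube; a state that maps onto itself is a fixed point of the kind used below for associative recall.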

In the remainder of this section, we will discuss content
addressable memory (CAM) architectures based on partitioning
of the four-dimensional memory or interconnection matrix
$T_{ijkl}$ encountered in the storage of 2-D entities.
A specific architecture and implementation based on the
use of a partitioned unipolar binary (u.b.) memory matrix and
the use of adaptive thresholding in the feedback loop, relevant
to the treatment given below, have been described
elsewhere [23]. The use of u.b. memory masks greatly simplifies
optical implementations and facilitates the realization
of larger networks ($10^3$ to $10^4$ neurons). Numerical
simulations of the use of such 2-D networks in the recognition
of  dilute  point-like  objects  similar  to  those  arising  in  radar 
and  other  similar  remote  sensing  imaging  applications  show 
that  dilute  objects  pose  a  problem  for  CAM  storage  because 
of  the  small  Hamming  distance  between  them.  The  Ham¬ 
ming  distance  between  two  binary  vectors  or  matrices  of 
the  same  dimension  is  the  number  of  bits  in  which  they 
differ.  We  show  that  coding  in  the  form  of  a  sinogram  rep¬ 
resentation  or  feature  space  of  the  dilute  object  can  remove 
this  limitation  and  leads  to  recognition  from  partial  versions 
of the stored entities. The advantage of this capability in
super-resolved recognition of radar targets, where the
principles and methodologies of microwave diversity imaging
described earlier are employed to form sinogram
representations that are compatible with 2-D CAM storage and
interrogation, is discussed. Super-resolved automated
recognition of scale models of three aerospace objects from
partial information that can be as low as 10 percent of a
learned entity is demonstrated employing hetero-associative
storage and recall where the recognition outcome is
a  word  label  describing  the  recognized  object.  The  treat¬ 
ment  here  is  similar  to  one  we  have  given  elsewhere  [23]. 
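
As a small illustration of the Hamming distance defined above, the following sketch counts differing bits between two binary matrices. The two 8 x 8 dilute objects are hypothetical and serve only to show how small the distance between point-like objects can be.

```python
import numpy as np

def hamming_distance(a, b):
    """Number of positions in which two binary vectors or matrices of equal shape differ."""
    a, b = np.asarray(a), np.asarray(b)
    if a.shape != b.shape:
        raise ValueError("entities must have the same dimensions")
    return int(np.count_nonzero(a != b))

# Two hypothetical dilute objects with three point scatterers each; one scatterer is displaced.
obj_a = np.zeros((8, 8), dtype=int)
obj_b = np.zeros((8, 8), dtype=int)
obj_a[1, 2] = obj_a[4, 4] = obj_a[6, 1] = 1
obj_b[1, 2] = obj_b[4, 4] = obj_b[6, 3] = 1
d = hamming_distance(obj_a, obj_b)
```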

A.  Two-Dimensional  Neural  Nets 

Storage and readout of 2-D entities in a content addressable
or associative memory is described next. Given a set
of M 2-D bipolar binary [+1, -1] patterns or entities $v_{ij}^{(m)}$,
m = 1, 2, ..., M, each of N x N elements, i.e., N x N bipolar
binary matrices, these can be stored in a manner that is a
direct extension of the outer product storage formula for
1-D entities [31], [9], [10] as follows: For each element of a
matrix, a new N x N matrix is formed by multiplying the
value of the element by all elements of the matrix including
itself, taking the self product as zero. The outcome is a new
set of binary bipolar matrices each of N x N elements.
A formal description of this operation is

$$T_{ijkl}^{(m)} = \begin{cases} v_{ij}^{(m)} v_{kl}^{(m)}, & i, j, k, l = 1, 2, \ldots, N \\ 0, & i = k,\ j = l \end{cases} \tag{1}$$

which is a four-dimensional matrix. An overall or composite
synaptic matrix or connectivity memory matrix is formed
then by adding all 4-D matrices $T_{ijkl}^{(m)}$, i.e.,

$$T_{ijkl} = \sum_{m} T_{ijkl}^{(m)}. \tag{2}$$


This symmetric 4-D matrix has elements that range in value
from -M to M in steps of two and which assume values
of +1 and -1 (and zeros for the self-product elements) when
the matrix is clipped or thresholded, as is usually preferable
for optical implementations [10], [23]. Two-dimensional
unipolar binary [0, 1] entities $b_{ij}^{(m)}$ are frequently encountered
in practice (e.g., binarized images and object representations).
These can be transformed into bipolar binary matrices
by forming $v_{ij}^{(m)} = 2b_{ij}^{(m)} - 1$, which are then used to
form the 4-D connectivity matrix or memory matrix as
described before. Also, as in the 1-D neural-net case, the
prompting or initializing entity can be unipolar binary
$b_{ij}^{(m)}$, which would further simplify optical implementations
in  incoherent  light  [10],  [23]. 
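
The storage recipe of (1) and (2) can be sketched numerically as follows. This is an illustrative sketch only: the function name `store_patterns`, the `clip` option, and the random 4 x 4 test patterns are assumptions, not part of the original work.

```python
import numpy as np

def store_patterns(unipolar_patterns, clip=False):
    """Build the 4-D memory matrix T[i, j, k, l] from M unipolar (0/1) N x N patterns."""
    b = np.asarray(unipolar_patterns, dtype=int)      # shape (M, N, N)
    v = 2 * b - 1                                     # bipolar versions, v = 2b - 1
    T = np.einsum("mij,mkl->ijkl", v, v)              # sum over m of the outer products
    n = v.shape[1]
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    T[i, j, i, j] = 0                                 # self-product elements (i = k, j = l)
    return np.sign(T) if clip else T                  # optional clipping to -1, 0, +1

rng = np.random.default_rng(0)
patterns = rng.integers(0, 2, size=(3, 4, 4))         # M = 3 hypothetical patterns, N = 4
T = store_patterns(patterns)
```

With M = 3 stored patterns the unclipped elements take odd values between -3 and 3, which is the "steps of two" behavior noted in the text.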

Architectures  for  optical  implementation  of  2-D  neural 
nets  must  contend  with  the  task  of  realizing  a  4-D  memory 
or  interconnectivity  matrix.  Here,  a  scheme  is  presented 
that  is  based  on  partitioning  the  4-D  memory  matrix  into  an 
array of N x N 2-D matrices, each of which contains N x
N elements. Thus, a 2-D neural net of N x N = 32 x 32 neurons
would contain $N^4$ interconnections, i.e., over a million
interconnections,  which  shows  why  hardware  implemen¬ 
tations  that  use  light  and  optical  interconnections  rather 
than  electronic  interconnects  are  attractive.  Provided  that 
the  number  of  entities  stored  is  not  excessive  (see  below), 
the  4-D  interconnection  matrix  thus  formed  makes  the  sta¬ 
ble  states  of  the  net  (attractors  in  phase-space)  identical  to 
the  entities  stored.  The  maximum  number  of  2-D  entities 
that  can  be  stored  in  this  fashion  without  degradation  of 
recall is $M = N^2/(8 \ln N)$, which follows directly from the storage
capacity formula for the 1-D neural-net case [24] (that is, $n/(4 \ln n)$
with $n = N^2$ neurons). When
initiated  from  a  partial  version  of  a  given  state,  the  network 
quickly  converges,  in  a  matter  of  a  few  iterations  (see  below) 
or  time  constants  of  the  "neuron,"  to  the  stored  entity  clos¬ 
est  in  the  Hamming  sense  to  the  initiating  vector  or  matrix. 
This  nearest  neighbor  search  of  the  memory  matrix  for  a 
given entity $b_{ij}^{(m_0)}$ is done by forming the estimate

$$\hat{b}_{ij}^{(m_0)} = \sum_{k,l=1}^{N} T_{ijkl}\, b_{kl}^{(m_0)}, \qquad i, j, k, l = 1, 2, \ldots, N \tag{3}$$

followed by thresholding to obtain a new unipolar binary
matrix which is used to replace $b_{kl}^{(m_0)}$ in (3); the procedure
is repeated to obtain a new estimate or state matrix. This
process is repeated again and again until the state matrix
or "vector" converges to the stored entity closest to the
initializing matrix $b_{ij}^{(m_0)}$. This iterative process describes
motion  of  the  state  "vector"  of  this  2-D  net  in  its  phase-space 
and  can  be  viewed  as  a  multidimensional,  nonlinear,  dis¬ 
crete,  dynamical  system  describing  the  net’s  evolution  from 
iteration  to  iteration.  Architectures  for  optoelectronic 
implementation of the auto-associative storage and recall
process described above based on partitioning the 4-D
interconnection  matrix  in  an  array  of  N  x  N  submatrices 
each  of  N  x  N  elements  have  been  described  in  detail  else¬ 
where  [23]. 
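
The recall iteration of (3) and the capacity bound quoted above can be sketched as follows. The sketch assumes a simple zero threshold in place of the adaptive thresholding used in the cited architecture, and it uses dense random 32 x 32 patterns as hypothetical stored entities; `capacity` and `recall` are illustrative names.

```python
import math
import numpy as np

def capacity(n):
    """Storage bound M = N^2 / (8 ln N) quoted in the text for an N x N net."""
    return n ** 2 / (8 * math.log(n))

def recall(T, probe, max_iter=10):
    """Iterate estimate (3) followed by a zero threshold until the state stops changing."""
    state = np.asarray(probe, dtype=int)
    for iteration in range(1, max_iter + 1):
        estimate = np.einsum("ijkl,kl->ij", T, state)
        new_state = (estimate > 0).astype(int)        # back to a unipolar binary matrix
        if np.array_equal(new_state, state):
            return state, iteration
        state = new_state
    return state, max_iter

n, m = 32, 3
rng = np.random.default_rng(1)
stored = rng.integers(0, 2, size=(m, n, n))           # hypothetical stored entities
v = 2 * stored - 1
T = np.einsum("mij,mkl->ijkl", v, v)                  # equations (1) and (2)
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
T[i, j, i, j] = 0

probe = stored[0].copy()
probe[:, n // 2:] = 0                                 # present only half of the columns
result, iterations = recall(T, probe)
```

For N = 32 the memory matrix holds 32^4 = 1,048,576 elements and the capacity bound evaluates to roughly 36 entities, so three stored patterns are well within it.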

B. Sinogram Representation and Hetero-Associative
Storage

Sinograms  are  object  representations  encountered  in 
tomography [25], [26]. In simple terms applicable to microwave
scattering, the sinogram of a scattering object is a
Cartesian plot of the measured relative range or differential
range of scattering centers on the object versus aspect
angle. A scattering center is defined as any structural detail
on the object that contributes to the measured scattered
field. In our work, the sinogram of a target is formed by
measuring the range-profile or differential range of the target
as a function of the aspect angle at fixed elevation
angle θ (see Fig. 4(c)) and by arranging the modulus of the
measured range-profiles as vertical line intensity patterns
side-by-side as a function of aspect angle (for example,
range-profiles versus azimuthal angle φ in Fig. 4(c) at fixed
elevation angle θ). Sinogram construction is illustrated in Fig.
4 for a planar object consisting of three points of unequal
strength. This object is chosen to represent a highly simplified
radar target. Every point (or scattering center) of the
object generates a sinusoidal trace in the sinogram whose
amplitude is determined by the radial distance of that point
from the center of rotation, whose phase is determined by
the angular position of that point, and whose brightness or
strength (represented in Fig. 4(b) by line thickness) is
proportional to the strength of the scattering center. Note that
scatterer 3, whose position coincides with the center of
rotation, produces a zero amplitude line in the sinogram.
A complete sinogram is produced by rotating the object
360°. It is worth noting that the range-profile of an object
is independent of its far-field distance from the transmitter/
receiver (T/R) in Fig. 4(a) or Fig. 4(c). The range-profile
depends, however, on object aspect and on the spectral
window and polarization used in data acquisition (see
Section II-B).

Fig. 4. Sinogram representation. (a) Scattering geometry for
an idealized planar object. (b) Sinogram. (c) Simplified
arrangement for experimental generation of sinogram.

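The sinusoidal traces described above can be reproduced with a minimal sketch for ideal isotropic point scatterers. It assumes that the differential range of a scatterer at radius ρ and angular position α is ρ cos(φ - α) at aspect angle φ, ignores scatterer strength, and uses an illustrative grid size; `point_sinogram` is a hypothetical helper.

```python
import numpy as np

def point_sinogram(scatterers, n_angles=64, n_range=65, max_radius=1.0):
    """Binary sinogram of ideal point scatterers given as (radius, angle_rad) pairs."""
    phi = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)   # aspect angles
    image = np.zeros((n_range, n_angles), dtype=int)
    for radius, alpha in scatterers:
        rel_range = radius * np.cos(phi - alpha)                    # differential range
        rows = np.rint((rel_range + max_radius) / (2 * max_radius) * (n_range - 1)).astype(int)
        image[rows, np.arange(n_angles)] = 1                        # one pixel per aspect angle
    return image

centre_only = point_sinogram([(0.0, 0.0)])          # scatterer at the center of rotation
offset_only = point_sinogram([(0.5, np.pi / 4)])    # scatterer at radius 0.5
```

A scatterer at the center of rotation occupies a single range row (the zero amplitude line), while an off-center scatterer sweeps a trace whose amplitude equals its radius.
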
Sinograms are particularly useful when the object is point-like
and sparse or dilute, as is the case in microwave diversity
imaging where the images formed consist ordinarily of
a finite number of isolated scattering centers. Given a set
of 2-D dilute objects (each consisting of a collection of a
finite number of distinct point scatterers) and their
corresponding set of associated sinogram representations, the
Hamming distances between the sinogram representations
will always be found to be greater than the Hamming
distances between the objects themselves. This is assuming
that objects and sinograms are quantized onto the same
number and grid of binary pixels. The reason for this is that
each point of the object produces a distinct sinusoidal trace
and thus spawns many points in the sinogram representation.
Therefore, if (for example) two dilute objects differ
in only two pixels, their sinogram representations will differ
by two sinusoidal traces and, hence, in many pixels. The
increased Hamming distance makes it easier for an associative
memory to distinguish between the sinograms than
to distinguish between the objects themselves. This has the
added attraction of making it less difficult to distinguish
between similar objects, that is, objects with small Hamming
distances between them. Sinogram representations
also have the advantage of being useful, as will be clarified
below, in achieving distortion invariant (scale, rotation, and
translation  invariant)  recognition. 
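
The increase in Hamming distance can be checked on a toy example. The sketch below assumes ideal point scatterers, a common 32 x 32 grid for objects and sinograms, and hypothetical scatterer positions; the two targets share two scatterers and differ in the position of a third.

```python
import numpy as np

N = 32  # objects and sinograms share the same N x N pixel grid

def to_pixel(value, lo=-1.0, hi=1.0):
    """Quantize coordinates in [lo, hi] onto pixel indices 0..N-1."""
    return np.clip(((np.asarray(value) - lo) / (hi - lo) * N).astype(int), 0, N - 1)

def object_image(scatterers):
    """Binary image of point scatterers given as (radius, angle_rad) pairs."""
    img = np.zeros((N, N), dtype=int)
    for radius, alpha in scatterers:
        img[to_pixel(radius * np.sin(alpha)), to_pixel(radius * np.cos(alpha))] = 1
    return img

def sinogram_image(scatterers):
    """Binary sinogram: each scatterer traces radius * cos(phi - alpha) versus aspect phi."""
    phi = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
    img = np.zeros((N, N), dtype=int)
    for radius, alpha in scatterers:
        img[to_pixel(radius * np.cos(phi - alpha)), np.arange(N)] = 1
    return img

def hamming(a, b):
    return int(np.count_nonzero(a != b))

shared = [(0.3, 0.0), (0.6, 2.0)]
target_a = shared + [(0.8, 1.0)]
target_b = shared + [(0.4, 4.0)]
d_objects = hamming(object_image(target_a), object_image(target_b))
d_sinograms = hamming(sinogram_image(target_a), sinogram_image(target_b))
```

The object images differ in two pixels only, whereas the sinograms differ wherever the two distinct traces do not coincide, so `d_sinograms` exceeds `d_objects`.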

In laboratory work, the sinogram representation of a
complex-shaped test object is obtained as depicted in Fig.
4(c), which is a highly simplified version of the measurement
system of Fig. 1. The number of range-profiles $N_\phi$
needed to characterize the object and the number of samples
$N_R$ within each range-profile are determined by angular
and spectral sampling considerations. Thus, for an object
with maximum extent L, the maximum number of angular
samples in one azimuthal direction is $N_\phi = 4\pi L/\lambda_{\min}$, and
the maximum number of samples $N_R$ within a range-profile
is $N_R = N_f = 2\Delta f L/c$, where $\Delta f$ is the width of the spectral
window used, $N_f$ is the number of frequency points, c is the
velocity of light, and $\lambda_{\min}$ is the shortest wavelength used.
The sinogram of an actual microwave target differs in
appearance from the sinogram of the idealized object
described above in that the intensity or brightness of its
sinusoidal traces changes (fades in and out) with the aspect
or viewing angle because of the anisotropic nature of the
scattering centers on actual targets. Fig. 5 gives an example
of the sinogram of a scale model of a B-52 test object
produced from a slice of its Fourier space (shown in Fig. 5(a))
obtained at an object inclination angle of θ = 30° (see Fig.
4(c)) employing the measurement facility of Fig. 1. Both
intensity and 3-D perspective displays of the resulting
sinogram are shown in Fig. 5(b) and (c), respectively. The
sinogram shown demonstrates clearly how sinusoidal traces
of the different scattering centers fade in and out as a
function of target aspect (here azimuth angle φ) and how
point-like scattering centers such as the tips of engines and fuel
tanks (see the B-52 part of Fig. 3(d)) produce more distinct
traces than edge-like or extended scattering centers of the
target. Thus, the sinogram pattern tends to characterize the
target by its dominant point-like scattering centers that are
visible over an extended range of aspect angles. The
sinogram pattern is a map of the measured relative positions
between such centers as the target is rotated about a
specified axis. Complete sinogram representation of a 3-D target
involves sinogram maps such as the one shown in Fig. 5 for
all elevation angles θ of expected encounter. The range of
azimuth angles φ needed would be confined to those
indicated by practical encounter scenarios.

Fig. 5. Sinogram of a 100:1 scale model of a B-52 test object. (a) Fourier space slice from
which the sinogram is generated. (b) Intensity display and (c) 3-D perspective display of
the sinogram.

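The sampling bounds can be evaluated numerically with a short sketch. It assumes the bounds read N_φ = 4πL/λ_min and N_R = 2ΔfL/c, and it uses a hypothetical 1 m object and a hypothetical 6 to 17 GHz spectral window; none of these numerical values are taken from the measurements reported here.

```python
import math

C = 299_792_458.0  # velocity of light in m/s

def angular_samples(extent_m, lambda_min_m):
    """Maximum number of angular samples, assumed N_phi = 4 * pi * L / lambda_min."""
    return 4.0 * math.pi * extent_m / lambda_min_m

def range_samples(extent_m, bandwidth_hz):
    """Maximum number of samples per range-profile, assumed N_R = 2 * delta_f * L / c."""
    return 2.0 * bandwidth_hz * extent_m / C

# Hypothetical example values.
f_min, f_max = 6e9, 17e9
n_phi = math.ceil(angular_samples(1.0, C / f_max))
n_r = math.ceil(range_samples(1.0, f_max - f_min))
```

Interpolating such a sinogram onto a 32 x 32 grid, as is done for the test objects below, is therefore a substantial reduction of the raw sampling under these assumed values.
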
It  is  worth  mentioning  that  the  alignment  of  the  range- 
profiles  to  produce  distinct  sinusoidal  traces  in  the  range- 
aspect  displays  (Fig.  5(b)  and  (c))  is  an  essential  requirement 
for image reconstruction by back-projection [20]. The alignment
process also defines the center of rotation or phase
center  of  the  target,  in  that  had  a  point  scatterer  been 
located  at  the  rotation  center,  it  would  produce  a  straight 
line  at  constant  range  in  the  sinogram.  The  alignment  is  also 
equivalent to the TDR procedure referred to in Section
II-A and described in more detail elsewhere [8], [21]. Thus,
formation of a distinct sinogram is not only needed for
representing the target but is also an essential step for removing
the unknown range to the phase center of the target and
the undesirable effects associated with migration
of its phase center with aspect. The crispness with
which one or more sinusoidal traces appear in the sinogram
in this alignment process can serve as a measure of how well
the unknown range to a common reference point (center
of rotation or phase center of the target) can be compensated
for in the different aspect looks at the target. Quantization
and thresholding of the sinogram pattern of Fig. 5
into a grid of N x N binary pixels yields the sinogram
representation $b_{ij}$ of the target that is suitable for the associative
storage and recall process described in Section III-A.
In the top row of Fig. 6 are shown the sinogram representations
of scale models of three aerospace test objects
(B-52, AWAC, and Space Shuttle) interpolated and digitized
onto a grid of 32 x 32 binary pixels. These are treated as a
learning set and stored hetero-associatively rather than
auto-associatively by replacing $v_{kl}^{(m)}$ in (1) by $r_{kl}^{(m)}$, where k,
l = 1, 2, ..., 32; m = 1, 2, 3, and where $r_{kl}^{(m)}$ are bipolar binary
versions of the abbreviated word labels shown in the bottom
row of Fig. 6 with which the three test objects are to
be associated. In this fashion, a nonsymmetric 4-D
hetero-associative memory or connectivity matrix $T_{ijkl}$ is formed in
which the associations between the three sinogram
representations and their word labels are embedded. The
connectivity matrix is used in the numerical simulations
described next.

Fig. 6. Hetero-associative storage. Sinogram representations
(top) and associated word labels (bottom) of three
aerospace test objects.

IV. Results

Numerical simulations of interrogating the hetero-associatively
formed memory matrix with complete and partial
versions of the three entities (sinogram representations)
stored in it following the procedures of Section III-A were
carried out. Complete and partial versions of the three
stored sinogram representations were used to initialize the
network. The partial versions of the stored entities ranged
down to a fraction η = 10 percent of the full representation.
Here, η is the ratio of the number of range-profiles entered
to the total number of range-profiles used to characterize
the object in its sinogram representation. Reliable
identification of the partial sinogram input was found to occur
after one iteration for all entities stored down to η = 0.2,
confirming convergence to a stable state even though the $T_{ijkl}$
matrix is not symmetric, whereas symmetry is usually required
for convergence [31]. This, and the observed speed of
convergence, may indicate a difference in the behavior of 2-D nets
and their 1-D counterparts whose reason is yet to be
determined. For η = 0.1 or less, successful recall of correct labels
was found to depend on the angular location of the partial
data with which the memory is presented. In most cases of
η = 0.1, the net labeled the partial initializing input
correctly, as illustrated in the AWAC example of Fig. 7, and in
those cases when it did not do so, it produced a garbled
and/or contrast-reversed version of a label that resembled
one of the other labels (see the B-52 examples in Fig. 7).
Below η = 0.1, the reliability of recall deteriorates rapidly.
However, in nearly all simulations with partial input, failure
of the net to label the entry correctly was manifested by
convergence onto a garbled and/or contrast-reversed version
of one of the identifying labels. This behavior could
be usefully interpreted as the net indicating it has
insufficient information and that more information is needed
before a decision (identification) can be made and that
otherwise no decision should be made.

Rapid "one-shot" convergence to correct association
exhibited above even with small values of η means, in the
language of dynamical systems, that the fixed-point attractors
(stored associations) in the phase-space of the net are
strong and possess large basins of attraction.

The results above illustrate the potential of neuromorphic
processing in object identification from partial sinogram
information (object representation). What is noteworthy
is that the net in those simulations performed the
functions of storage, processing, and labeling simultaneously,
which is the hallmark of distributed collective processing.
The performance of such nets is also known from
other work to be robust and fault tolerant. In an actual hardware
implementation of a prototype neural net of 32 neurons,
correct associative recall from partial information
continued to take place even when nearly 20 percent of the
neurons were disabled [10]. The binary nature of the sinogram
representation resulting from interpolation and
thresholding the raw sinogram data (e.g., of Fig. 5(a)) is
expected to impart to it some immunity to noise present
in the measured data. To apply the method in practice, several
issues related to the generation of sinogram libraries
and to the ability to determine the aspect angles for which
data is collected must be considered. These are briefly
addressed in the following section.

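One-shot hetero-associative labeling from a fraction η of the range-profiles can be sketched as follows. The sketch makes several assumptions that should be read as such: dense random 32 x 32 matrices stand in for the measured sinograms and for the word labels, the label occupies the (k, l) index pair so that recall contracts the probe with the (i, j) pair, and a zero threshold is used. The helper names are illustrative.

```python
import numpy as np

def hetero_store(sinograms, labels):
    """T[i, j, k, l] = sum_m v[m, i, j] * r[m, k, l], with v and r the bipolar versions."""
    v = 2 * np.asarray(sinograms, dtype=int) - 1
    r = 2 * np.asarray(labels, dtype=int) - 1
    return np.einsum("mij,mkl->ijkl", v, r)

def partial_view(sinogram, eta, start=0):
    """Keep a contiguous block of range-profiles (columns); eta is the fraction retained."""
    n_cols = sinogram.shape[1]
    keep = max(1, int(round(eta * n_cols)))
    cols = (start + np.arange(keep)) % n_cols
    partial = np.zeros_like(sinogram)
    partial[:, cols] = sinogram[:, cols]
    return partial

def recall_label(T, probe):
    """One iteration: contract the probe with the sinogram indices, then threshold at zero."""
    return (np.einsum("ijkl,ij->kl", T, probe) > 0).astype(int)

rng = np.random.default_rng(2)
sinograms = rng.integers(0, 2, size=(3, 32, 32))   # random stand-ins for measured sinograms
labels = rng.integers(0, 2, size=(3, 32, 32))      # random stand-ins for the word labels
T = hetero_store(sinograms, labels)
recovered = [recall_label(T, partial_view(s, eta=0.2)) for s in sinograms]
```

In this toy setting, a failure would show up as every label pixel being perturbed by the same cross-talk terms, which is consistent with the garbled or contrast-reversed labels described above; the `start` argument allows the angular location of the partial data to be varied.
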

V.  Discussion 

Methodologies  of  microwave  diversity  imaging  studied 
extensively  at  the  Electro-Optics  and  Microwave  Optics 
Laboratory  of  the  University  of  Pennsylvania  for  more  than 
two  decades  provide  the  basis  for  a  new  generation  of  3-D 
tomographic imaging radars that can furnish shape
estimates of the 3-D distribution of scattering centers on remote
aerospace targets with near-optical resolution. Arrays of
broad-band coherent transmitter-receiver pairs employing
the TDR technique can be used to access the Fourier space
of remote scattering targets. Resolution in such systems
depends on the angular and spectral windows utilized for
data acquisition and on polarization diversity. Unprecedented
centimeter resolution has been demonstrated in
projection rather than 3-D images of scale models of such
targets employing gigahertz spectral windows, wide angular
windows of π/2 [rad], and image enhancement by polarization
diversity and symmetrization. Image-reconstruction
algorithms based on Fourier inversion or on filtered
backprojection are equally applicable and have been found
to yield comparable results. The use of spectral, angular,
and polarization degrees of freedom in such imaging systems
has the advantage of increasing the information content
of the object-scattered wavefields. This enables a broad-band,
polarization-selective array aperture to acquire more
information about a scattering object than it could have
monochromatically (at a single frequency) or at a single
polarization.  A  useful  tradeoff  between  spectral  and  angu¬ 
lar  degrees  of  freedom  exists.  It  enables  considerable  thin¬ 
ning  of  the  imaging  array.  Because  angular  degrees  of  free¬ 
dom  are  associated  with  the  number  of  elements  or  stations 
in  the  array,  their  replacement  with  less  costly  spectral 
degrees  of  freedom,  associated  with  the  number  of  fre¬ 
quency  points  used  in  data  acquisition,  can  cut  cost  and 
lead  to  significant  improvement  in  cost  effectiveness. 

Despite  these  attractive  attributes  of  microwave  diversity 
imaging  systems,  there  are  circumstances  when  the  base¬ 
line  (physical  or  synthetic)  required  to  realize  the  wide 
angular  windows  needed  to  achieve  high  resolution  is  not 
available  or  is  not  sufficient  to  form  a  recognizable  image. 
One  has  to  rely  then  on  means  of  target  identification  other 
than image formation and analysis by the eye-brain system.

The  "neuromorphic"  or  "brain-like"  processing 
approach  to  super-resolved,  robust,  and  fault  tolerant  rec¬ 
ognition  described  in  the  preceding  section  is  not  only 
intellectually  attractive,  providing  for  the  first  time  a  con¬ 
nection  between  neural  nets  and  applied  electromagnetics, 
but could also obviate the need for large expensive imaging
array systems (of the type needed in microwave diversity
imaging systems and other more conventional approaches
to radar target imaging) and can avoid the time spent for
aperture synthesis, for example, by target motion in ISAR
imaging. The implication of this for microwave (and other)
automated object-identification systems can be far reaching
and is sufficient motivation to search for a new generation
of automated neuromorphic radar and sonar recognition
systems that can identify remote targets from only
a  few  looks  [27].  Many  of  the  findings  of  the  work  reported 
here  also  carry  over  to  the  domain  of  machine  vision  and 
recognition for robotic applications. The problem then is,
however, more complex because objects of interest are not
found in perfect isolation, as is the case in recognizing
aerospace targets. In the radar-target identification scenario,
suitable target representations (signatures or feature
spaces) such as the sinogram representation described
above would be generated cost effectively from scale
models of targets of interest in a controlled anechoic chamber
environment employing measurement systems of the
type we have described. The representations would be
"taught" to an associative memory or a neural network that
can be used to recognize partial sinogram representations
of actual targets collected by actual broad-band coherent
radar systems. Realization of this scenario entails careful
consideration of scaling issues and of the principles of
"electromagnetic similitude" [28] in order to ensure that
the  sinogram  representations  collected  using  scale  models 
in  an  anechoic  chamber  RCS  measurement  facility  resem¬ 
ble  as  closely  as  possible  those  of  the  actual  targets.  This 
and other issues, such as "fluctuations" of echoes from actual
airborne targets because of flexing, deformation, or
wind-buffeting; the minimum number of looks (range-profiles)
needed to represent an actual target, i.e., characterize it for
all practical encounter aspects; the number and size of
neural  models  needed  for  the  identification  of  a  given  num¬ 
ber  of  targets,  together  with  the  use  of  sequential  storage 
and  recall,  and  the  self-organization  and  learning  capaci¬ 
ties  of  neural  nets  must  be  addressed  before  the  neuro¬ 
morphic  approach  to  target  identification  can  find  practical 
application. The latter capabilities have the potential of
producing improved neuromorphic target recognition
schemes that can learn the underlying structure of the
associations presented to them with generalization (i.e.,
non-rote learning) [30]. These issues and others are currently
under investigation [32]. The ultimate aim of this work is to
achieve reliable distortion-independent recognition from
one look. In this regard, we offer the following final remarks.
Because, for a fixed spectral window, the range-profile of a
target is basically independent of range and depends only
on  target  aspect,  the  prospect  of  achieving  recognition  from 
a  single  look  (single  range-profile)  would  mean  complete 
distortion-independent  identification,  that  is,  recognition 
independent  of  target  range  or  aspect.  How  can  this  be 
done? One can conceive of the following approach or scenario
that is being considered in our work [33] as a direct
extension  of  the  ideas  given  in  this  paper.  In  this  approach, 
one  seeks  neural-net  structures  and  storage  recipes  that 
can  produce  prescribed  controlled  periodic  attractors. 
Periodic  attractors  are  represented  by  closed  trajectories 
in  phase-space.  Thus,  we  envision  a  net  in  which  we  can 
specify  and  obtain  the  next  state  of  the  net  given  the  present 
state  in  a  closed  or  open  sequence  of  states  to  enable  stor¬ 
age  and  recall  of  prescribed  sequences  of  state  vectors 
instead  of  the  "fixed  point"  phase-space  attractors  encoun¬ 
tered  in  the  above  hetero-associative  storage  and  recall 
work.  Each  periodic  attractor  in  the  envisioned  net  would 
consist of a sequence of state vectors representing, for
example, thresholded versions of angularly adjacent range-profiles
of a target, with each sequence containing an extra
label vector inserted to identify the target associated with
that sequence or periodic attractor. A periodic attractor of
the  net  associated  with  a  given  target,  would  be  triggered 
when  the  net  is  initiated  by  either  an  initial  state  that  coin¬ 
cides  with  one  of  the  constituent  thresholded  range-pro¬ 
files  of  that  attractor  or  by  an  initial  state  that  is  sufficiently 
close  to  any  one  of  the  constituent  range-profiles  in  the 
Hamming  sense.  No  matter  which  thresholded  range-pro¬ 
file  is  used  to  initiate  it,  the  net  would  eventually  end  up 
cycling  through  the  associated  periodic  attractor  and, 
hence, through all other associated range-profiles, including
the label state vector, whose occurrence we assume can
be isolated and used to trigger an identification marker of
the target, thus identifying it from the single available
initializing range-profile data. Although in its early stages of
development, the above "phase-space engineering" concept
and possible scenario for automated target identification
from a single wideband radio echo helps one appreciate
the unique possibilities and power of the neural
paradigm and the collective nonlinear dynamical system
theory approach to signal processing.
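
The periodic-attractor scenario described above can be illustrated with a small sketch. It is not the storage recipe under investigation in [33]; it assumes an asymmetric outer-product rule, synchronous threshold updates, and mutually orthogonal bipolar vectors (rows of a Hadamard matrix) standing in for two thresholded range-profiles and one label vector. All function and variable names are hypothetical.

```python
def hadamard(n):
    """Sylvester construction (n a power of two): rows are mutually orthogonal +/-1 vectors."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def store_sequence(states):
    """Assumed storage rule: T = sum_k s_{k+1} s_k^T taken around a closed cycle."""
    n = len(states[0])
    T = [[0] * n for _ in range(n)]
    for k, s in enumerate(states):
        nxt = states[(k + 1) % len(states)]
        for a in range(n):
            for b in range(n):
                T[a][b] += nxt[a] * s[b]
    return T

def step(T, v):
    """Synchronous update: threshold each neuron's weighted input to +/-1."""
    return [1 if sum(t * x for t, x in zip(row, v)) >= 0 else -1 for row in T]

H = hadamard(8)
profile_a, profile_b, label = H[1], H[2], H[3]   # stand-ins, not measured data
T = store_sequence([profile_a, profile_b, label])

noisy = list(profile_a)
noisy[0] = -noisy[0]                 # initial state at Hamming distance 1 from profile_a
trajectory = [noisy]
for _ in range(4):
    trajectory.append(step(T, trajectory[-1]))
```

Started from the corrupted profile, the net falls onto the stored cycle and passes through the label vector, which is the event the text proposes to use as the identification marker.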
Abstract— A neural net processor is described for echo inversions and target shape estimations from incomplete
frequency  response  data.  The  processor  accomplishes  the  inversion  and  estimation  by  minimizing  an  energy 
function  which  bears  information  about  the  measured  data,  as  well  as  the  relationship  between  the  target  shape 
function  (image)  to  be  reconstructed  and  its  frequency  response.  An  iterative  algorithm  is  developed  for  the 
processor  to  minimize  its  energy  function  to  give  the  desired  image  as  its  neural  state  outputs.  Successful  digital 
reconstructions  with  the  neural  net  processor  using  microwave  radar  imaging  data  are  presented  and  an  opto¬ 
electronic  implementation  of  the  processor  is  described.  Heuristic  extension  to  make  the  processor  more  neu¬ 
romorphic  by  introducing  nonlinearity  is  discussed  and  digital  reconstructions  with  this  extension  are  shown; 
these  reflect  noticeable  improvement  in  image  quality. 

Keywords — Neural processing, Radar imaging, Recovery from partial information, Ill-posedness,
Regularization, Opto-electronic architectures.


I.  INTRODUCTION 

Neural  net  models  and  their  analogs  (Ballard,  1986; 
Hopfield,  1982)  represent  a  new  approach  to  collec¬ 
tive  signal  processing  that  is  robust  and  fault  tolerant 
and  can  be  extremely  fast.  These  properties  stem 
directly  from  well  recognized  information  processing 
capabilities  of  the  brain.  Although  the  brain  is  not 
as  good  in  arithmetic  operations  as  a  digital  com¬ 
puter,  it  is  known  that  when  it  comes  to  operations 
such  as  association,  categorization,  classification, 
feature  extraction,  recognition,  and  optimization,  it 
can  outperform  even  the  most  powerful  up-to-date 
computers.  Collective  information  processing  in  the 
brain  makes  use  of  the  massive  interconnectivity  of 
neurons  (the  decision  making  elements)  of  the  brain 
and  their  ability  to  store  information  as  weights  of 
links  between  them.  The  brain’s  amazing  capabilities 
in  analyzing  sensory  data  along  with  its  complex 
thought  and  intelligent  reasoning  ability  makes  it  an 
intriguing  model  for  smart  sensing  and  automated 
recognition  systems.  An  interesting  aspect  of  the 
brain’s ability to process sensory data is the ease with
which it solves computationally complex problems,
associated for example with vision, that are basically
inverse problems, which are known to be computationally
vexing because of their ill-posedness (Tikhonov
& Arsenin, 1977). When processing sensory
data, the brain can still perform its tasks successfully
even when the information it is presented with is partial
(or incomplete) and contains errors. Based on those
remarkable  information  processing  capabilities  of 
neurons  in  the  brain,  a  neural  net  processor  is  studied 
and  reported  upon  here  in  the  context  of  image  re¬ 
constructions  from  incomplete  data. 

The  problem  of  image  (or  object  function)  recon¬ 
struction  from  limited  frequency  data  arises  in  many 
remote  sensing  applications  including  radar  and 
sonar  imaging.  A  one-dimensional  object  function 
f(r)  of  limited  extent  possesses  a  frequency  response 
(Fourier  transform)  F(p)  that  extends  over  the  entire 
frequency  space.  In  practice,  the  frequency  response 
F(p)  can  only  be  measured  over  a  finite  region  of 
the frequency space (p-space). The traditional approach
of Fourier inversion of the measured response F_d(p)
yields an imperfect estimate f̂(r) of the
object function because values of F(p) outside the
frequency measurement window are taken to be zero,
which violates a priori knowledge. Retrieval of f(r)
from F_d(p) is also known to be an ill-posed problem
in the sense that noise contamination and incompleteness
of the measured response F_d(p) can result
in large fluctuations in object function reconstructions.
The neural net processor to be discussed accomplishes
the reconstruction from incomplete
frequency response data by minimizing an energy
function which bears information about the measured
data and the underlying (Fourier) relationship between
the object function and the measured data.
The energy function is set up in such a way as to
agree with a priori knowledge and overcome the
ill-posedness of the problem. An iterative algorithm for
the processor is derived, and its performance is evaluated
numerically in the reconstruction of radar range
profiles from realistic data. The
realistic data are collected in a broad-band microwave
cross-section measurement facility employing
microwave diversity techniques in which angular,
spectral, and polarization degrees of freedom are
combined to extract maximum information about the
scattering object. Therefore all the following discussion
with regard to imaging will be relevant to microwave
diversity radar imaging.

The  implementation  of  the  neural  net  processor 
can  be  achieved  opto-electronically.  The  massive 
connectivity  and  parallelism  of  the  neural  net  pro¬ 
cessor  can  be  realized  by  optics  while  the  decision 
making  and  any  required  gain  can  be  realized  by 
electronics.  The  detailed  implementation  scheme  of 
the  neural  net  processor  will  be  discussed.  Finally, 
heuristic  extension  of  the  neural  net  processor  to 
include  nonlinear  neural  mapping,  which  makes  it 
more  neuromorphic,  will  be  discussed  and  the  re¬ 
construction  of  microwave  diversity  radar  image 
based  on  this  extension  will  also  be  given. 

II.  BACKGROUND 

In microwave diversity radar imaging (Farhat, Werner,
& Chu, 1985a), as well as in many other radar
imaging applications such as synthetic aperture radar,
the frequency response of an object (or
target) to be imaged can be accessed only for a limited
frequency band and a limited range of aspect
viewing angles, because of instrument limitations and
other practical reasons. The data in the microwave
imaging system described in Farhat et al. (1985a) are
collected over a polar format or polar frequency grid
depicted in Figure 1. Here p_x and p_y are Cartesian
coordinates of spatial frequency space (p-space); p_1
and p_2 are the start and stop spatial frequencies,
respectively, associated with the start and stop
frequencies ω_1 and ω_2 employed in gathering the
frequency response data; θ_1 is the start viewing angle
and θ_2 the stop viewing angle, and Δθ is the total
viewing angle. The purpose of microwave imaging is
to extract information, such as the size and shape,
about a scattering target through microwave scattering
measurement. To achieve this goal in three
dimensions for a 3-dimensional (3-D) object,
the frequency scattering measurement has to be carried
out in the 3-D Fourier or frequency space of the
object. But measurement over a 3-D manifold in Fourier
space is impractical. On the other hand, the
2-dimensional (2-D) frequency space grid shown in Figure 1,
which represents a slice of the 3-D frequency
space, can be easily accessed with practically feasible
radar systems (Farhat et al., 1985a). When inverted,
this 2-D frequency space measurement gives rise to
a 2-D projection image of the 3-D object and, with a
sufficiently wide angular window (range of aspect
angles) Δθ, enough image information for identification
of the object can be obtained, as will be seen
in the results shown later. The 2-D image can be
reconstructed from the frequency measurement over
the frequency space grid in Figure 1 by invoking the
filtered back-projection theorem (Farhat, Ho, &
Chang, 1983) as follows. First, 1-dimensional (1-D)
inversion along the radial direction with respect to
the variable p for a given aspect angle θ is done to
obtain the so-called range-profile, which bears
information about the projection of the scattering centers
of the 3-D object on the line of sight of the
interrogating radar or radar cross-section measurement
system for the given aspect angle θ. Then, the 2-D
image is reconstructed by coherently summing the
filtered back-projected value (Farhat et al., 1983) of
every range profile for all aspect angles taken in the
imaging process (see Farhat et al. (1983) for more
details).
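
The two-step procedure above (one range-profile per aspect angle, then a sum of back-projected profiles) can be sketched in a deliberately simplified form. The sketch below is an assumption-laden toy, not the algorithm of Farhat et al. (1983): it omits the filtering step, sums real values rather than complex ones, and uses a single hypothetical point scatterer whose range-profile is one nonzero bin per view.

```python
import math

def range_bin(x, y, theta):
    """Range of point (x, y) along the line of sight at aspect angle theta, rounded to a bin."""
    return round(x * math.cos(theta) + y * math.sin(theta))

def back_project(profiles, angles, size):
    """Sum every range-profile back over a size-by-size pixel grid centred on the origin."""
    half = size // 2
    image = [[0.0] * size for _ in range(size)]
    for profile, theta in zip(profiles, angles):
        for iy in range(size):
            for ix in range(size):
                b = range_bin(ix - half, iy - half, theta)
                image[iy][ix] += profile.get(b, 0.0)
    return image

n_views = 16
angles = [k * (math.pi / 2) / (n_views - 1) for k in range(n_views)]   # 90 degree window
target = (3, -2)                                                        # hypothetical scatterer
profiles = [{range_bin(target[0], target[1], th): 1.0} for th in angles]
image = back_project(profiles, angles, size=15)
```

Only the pixel at the scatterer position collects a contribution from every view, so it stands out as the unique maximum of the summed image.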

The 2-D object function to be reconstructed
should reveal the size and shape of the scattering
target. Such a 2-D object function can be found under
general approximations. There are two approximations
involved in high resolution microwave
imaging work. One is the physical optics approximation,
which requires the wavelength of the microwave
used for imaging to be smaller than the
characteristic size of the object, and the other is the
Born approximation, which ignores multiple scattering
at the object (Farhat et al., 1985a; Ruck, Barrick,
Stuart, & Krichbaum, 1970). If the imaging frequency
band (spectral window) ω ∈ [ω_1, ω_2] is chosen
to satisfy both approximations, it can be rigorously
shown (Chan & Farhat, 1981) that the frequency
response F(p, θ) (or F(ω, θ)) corresponds to a real
object function when the object is perfectly conducting
or when it is composed of a lossless dielectric.
We will concentrate on this kind of object; however,
the method presented can be extended to other kinds
of objects.

From the 2-D image reconstruction procedure
mentioned earlier, it is seen that the 2-D image is
just a projected summation of range profiles over the
θ direction, and this summation process is, of course,
well-posed (Hadamard, 1923; Tikhonov & Arsenin,
1977) and can be done to the desired accuracy.
Therefore, if the range profile for every aspect angle
can be retrieved correctly, the 2-D image will be
reconstructed satisfactorily. The issue is then how to
reconstruct the range profile for every individual
aspect angle satisfactorily. Since the range profile
reconstruction for a given aspect angle is a 1-D problem,
we will use scalars r and p to represent points
in object domain and Fourier space or p-space,
respectively, and by object function we will be referring
to one range profile, unless it is otherwise specified as
a 2-D or 3-D object function in the following analysis.
The traditional approach to reconstructing a range
profile is to employ the Fourier inversion method.
Fourier inversion of the measured frequency response
F_d(p) yields an estimate of the object function,

    \hat{f}(r) = \int F_d(p)\, e^{jpr}\, dp = \int_{p_1}^{p_2} F(p, \theta)\, e^{jpr}\, dp.    (1)

Here, the following assumption about the measured
data has been made,

    F_d(p) = \begin{cases} F(p, \theta) & \text{for } p_1 < p < p_2 \text{ and fixed } \theta \\ 0 & \text{otherwise} \end{cases}

For the Fourier inversion method, it is assumed
that the frequency response F(p) outside the measurement
band [p_1, p_2] is zero and consequently the
object function f̂(r) estimated from the finite measurement
band will generally be complex. The assumption
and the result violate a priori knowledge
that an object function of finite extent in practice
has a frequency response of infinite extent and that
the object function to be reconstructed is real. Failure
of Fourier inversion in eqn (1) to satisfy a priori
knowledge can be traced to the ill-posed nature of
the inverse problem (Hadamard, 1923; Tikhonov &
Arsenin, 1977). In practice, the measured frequency
response is contaminated to some degree by noise, and
the measured response F_d(p) is only part of the
Fourier transform of the object function f(r) to be
reconstructed; that is, F_d(p) is incomplete. Retrieval of
f(r) from F_d(p) is ill-posed in the sense that a small
change in the measured frequency response F_d(p)
can radically alter the estimate f̂(r) of the object
function as a consequence of noise contamination
and incompleteness of F_d(p). This motivates the
study of a new method to achieve fast and robust
reconstructions satisfying a priori knowledge.
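
The claim that band-limited Fourier inversion of a real object generally returns a complex estimate can be checked numerically. The following is a minimal sketch under stated assumptions: a hypothetical unit point scatterer at r0 (not measured data), a measurement band lying entirely at positive p, and a plain Riemann sum standing in for the integral in eqn (1). The names `band_limited_inverse`, `r_grid`, and `dp` are illustrative.

```python
import cmath

def band_limited_inverse(Fd, p, r_grid, dp):
    """Discrete form of eqn (1): sum F_d(p_i) exp(j p_i r) dp over the measured band only."""
    return [sum(F * cmath.exp(1j * pi * r) for F, pi in zip(Fd, p)) * dp for r in r_grid]

r0 = 0.5                                        # hypothetical real point scatterer
dp = 0.5
p = [10.0 + dp * i for i in range(64)]          # measured band [p1, p2]
Fd = [cmath.exp(-1j * pi * r0) for pi in p]     # its frequency response inside the band
r_grid = [m / 64 for m in range(65)]
estimate = band_limited_inverse(Fd, p, r_grid, dp)

peak = max(abs(v) for v in estimate)
largest_imag = max(abs(v.imag) for v in estimate)
```

Although the object is real, the imaginary part of the estimate is a sizeable fraction of the peak away from r0, which is the violation of a priori knowledge the text refers to.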

III.  RECONSTRUCTION  BY 
NEURAL  PROCESSOR 

A  neural  net  processor  is  studied  to  solve  the  prob¬ 
lem  of  object  function  reconstruction  from  incom¬ 
plete  frequency  responses.  The  neural  net  processor 
models  the  collective  computational  behavior  of  neu¬ 
rons  in  human  brains  and  is  set  up  to  be  formed  of 
massively  interconnected  “neurons”  with  parallel 
processing  capability.  The  problem  to  be  solved  by 
the  neural  net  processor  is  formulated  in  terms  of 
desired  optima  (usually  minima).  To  compute  solu¬ 
tions  to  the  optimization  problem,  the  connectivities 
(synapses)  of  the  net  form  an  energy  space  to  ap¬ 
propriately  represent  the  optimization  problem  so 
that  the  net  will  rapidly  converge  to  its  energy  min¬ 
ima  corresponding  to  the  minima  of  the  problem 
when  the  bias  input  representing  the  available  in¬ 
formation  is  fed  into  the  net. 

The reconstruction of the range profile for an individual
aspect angle in microwave diversity imaging
is a 1-dimensional problem. For a given aspect angle,
the frequency response of the range profile (or
1-D object function) is known only over a finite frequency
band, and reconstructing the original object function
from this kind of knowledge of the frequency response
is an ill-posed problem. The energy function
for the neural net is therefore set up by the following
considerations:

1. The Fourier transform of the reconstructed object
function should agree with the known (measured)
frequency response over the given frequency band
[p_1, p_2] in the “best way”;

2. the ill-posedness will be remedied by using
regularization (Tikhonov & Arsenin, 1977).

Accordingly, the following function is chosen as the
energy function,

    H(f) = \| F_d(p) - F(p) \|^2 + \alpha R(f)    (2)

where f denotes the object function to be reconstructed
and is a state vector of the net; F_d(p) the
frequency response from measurement over [p_1, p_2];
F(p) the Fourier transform of f, and ‖·‖ a norm
defined on the frequency space containing F(p). It is seen
that the first term in eqn (2) does reflect the fitness
of the reconstruction with the known data in the
frequency domain, that is, the first term vanishes if
F_d(p) = F(p). The term R(f) in eqn (2) is a
regularization operator on f to overcome the ill-posedness
of the problem and it is chosen by considerations
of the object function to be reconstructed and a priori
knowledge. Since functions representing physical objects
in microwave diversity imaging are usually continuous
and cannot have abrupt discontinuities,
Tikhonov’s regularization functions (Tikhonov &
Arsenin, 1977) and their similar forms for maintaining
smoothness of the reconstruction will be used
and represented by R(f); one of the Tikhonov
regularization functions used in our study is

    R(f) = \int \{ f^2(r) + [f'(r)]^2 \}\, dr.    (3)

The constant α in eqn (2) is called the regularization
parameter; it controls the trade-off between fitness
(small α) and smoothness (large α) of reconstructions.
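
To make the two terms of eqn (2) concrete, here is a minimal sketch of a discretized energy function. It assumes a squared-modulus norm summed over the measured samples, a Riemann-sum Fourier transform with kernel exp(-j p r), and a forward-difference approximation of f'(r) in eqn (3); these discretization choices and all names are assumptions made for illustration.

```python
import cmath

def forward(f, p, r, dr):
    """Discretized Fourier transform F(p_i) = sum_m f(m) exp(-j p_i r_m) dr."""
    return [sum(fm * cmath.exp(-1j * pi * rm) for fm, rm in zip(f, r)) * dr for pi in p]

def regularizer(f, dr):
    """Discretized eqn (3): sum f^2 dr plus sum (forward difference / dr)^2 dr."""
    amplitude = sum(v * v for v in f) * dr
    slope = sum((b - a) ** 2 for a, b in zip(f, f[1:])) / dr
    return amplitude + slope

def energy(f, Fd, p, r, dr, alpha):
    """Eqn (2): data misfit over the measured samples plus alpha times the regularizer."""
    misfit = sum(abs(d - F) ** 2 for d, F in zip(Fd, forward(f, p, r, dr)))
    return misfit + alpha * regularizer(f, dr)

dr = 0.1
r = [m * dr for m in range(11)]
p = [5.0 + i for i in range(8)]
smooth = [0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0]     # hypothetical profiles, not measured data
rough = [0, 4, 0, 4, 0, 4, 0, 4, 0, 4, 0]
Fd = forward(smooth, p, r, dr)                 # pretend the smooth profile was measured
```

With α = 0 the profile that generated the data has zero energy; with α > 0 only the smoothness penalty remains, and it is far larger for the oscillating profile than for the smooth one.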

When the Fourier transform F(p) is expressed in
terms of the object function f, the energy function
H(f) in eqn (2) will only be a function of the variable
f, since F_d is the measured frequency response and
is known. An iterative algorithm for the neural net
processor is derived by evaluating ∂H(f)/∂f(m) to
find the energy change ΔH due to the change of the
mth sample of f, or the change of the state value of
the mth neuron in the processor. To find the minima
of H(f), H(f) is desired to decrease as f changes,
and accordingly the neural net update equation
(see appendix for detailed derivations) for the
(j + 1)th iteration in terms of the jth iteration is
found as,

    f^{(j+1)}(m) = f^{(j)}(m) + \lambda \Big\{ \sum_{m'=0}^{M} 2\,\mathrm{Re}[T_{mm'}]\, f^{(j)}(m') + I_m - S_m \Big\}    (4)

where S_m is viewed as a regularization related adaptive
threshold and for the regularization function in
eqn (3), it is given by

    S_m = 2\alpha \Big\{ f^{(j)}(m) \Big( \Delta r + \frac{A_1 + A_2}{\Delta r} \Big) - \frac{1}{\Delta r} \big[ A_1 f^{(j)}(m-1) + A_2 f^{(j)}(m+1) \big] \Big\}    (5)

with Δr being the sampling interval in the object
space and A_1 and A_2 given constants. Re[T_{mm'}]
is the real valued interconnectivity matrix that represents
the underlying Fourier transform and is given
by

    [equation not legible in the source]

with N being the total number of measurement samples
in the frequency domain and K_{im} the Fourier
kernel. The term I_m in eqn (4) is the external input
related to the measured data and is given by

    I_m = 2\,\mathrm{Re}\Big[ \sum_{i=1}^{N} F_d(i)\, K^{*}_{im} \Big]

and is identified as the real part of the complex range
profile generated by Fourier inversion of the measured
frequency response F_d; λ in eqn (4) is defined
as the “gain” of the mth neuron in the net and is
chosen so as to satisfy

    [condition not legible in the source]

in order to make H(f) decrease as f changes. The
iterative neural net algorithm in eqn (4) has been
digitally tested with realistic data F_d for scale models
of aerospace test objects collected in a microwave
imaging facility at the University of Pennsylvania.
One of the models used is a B-52 airplane. The frequency
response data are collected over a frequency
window from 6 GHz to 17 GHz and a 90° angular
viewing window which extends from broadside
to head-on of the airplane. Over the 90° viewing
window, a total of 128 looks (views) are taken,
and the range profile for each view is reconstructed
from the measured frequency response data using
the neural net processor expressed by eqn (4); the
2-D object function is finally obtained by the
back-projection algorithm, which coherently sums the
filtered back-projected value of f for all the views in
the proper angular orientations over a rectangular
image plane grid (Farhat et al., 1983). Shown in
Figure 2(a) is the B-52 airplane model used; Figures
2(b) and 2(c) show the 1-D range-profiles reconstructed
by the FFT algorithm and the neural net
processor, respectively. Shown in Figure 2(b) is the
modulus of the reconstructed complex range profile,
while shown in Figure 2(c) is the intensity of the real
range profile. The 2-D object image reconstructed
by the traditional FFT and filtered back-projection
is shown in Figure 3(a) and that obtained by the
neural net processor followed by filtered back-projection
is shown in Figure 3(b). Since the airplane is
illuminated from only one side of the fuselage, images
initially reconstructed are brighter on the illuminated
side of the fuselage, and both images shown
in Figures 3(a) and 3(b) are the symmetrically enhanced
images of the initial reconstructions about
the axis of symmetry of the airplane, the fuselage.
It is worth noting that most man-made aerospace
objects possess one or more axes of symmetry and
these are usually determined from object bearing.
Comparing the reconstructions by the two different
methods, the following observations can be made.

• Reconstruction by the neural net processor has a
lower background noise level, which can be helpful
in practical applications where the signal to noise ratio
is low, and leads to improved images.

• Reconstruction by the neural net processor is obtained
by assuming that the object function is real,
and this makes it easier for opto-electronic
implementations.

FIGURE 2. (a) B-52 airplane model; (b) reconstruction by FFT; (c) reconstruction by neural net processor.
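
The iteration of eqn (4) can be sketched as a descent on the discretized energy of eqn (2). Several ingredients below are assumptions rather than the paper's stated choices: the kernel K_im = exp(-j p_i r_m) Δr, the matrix T_mm' = -Σ_i K*_im K_im' (chosen so that the bracket in eqn (4) equals minus the gradient of H), end neurons handled by setting the corresponding A constant to zero, a synthetic two-block test profile instead of measured data, and a gain picked from a Gershgorin bound so that H cannot increase.

```python
import cmath

def build_net(Fd, p, r, dr):
    """Return kernel K, Re[T] and I for the assumed kernel K_im = exp(-j p_i r_m) dr."""
    M = len(r)
    K = [[cmath.exp(-1j * pi * rm) * dr for rm in r] for pi in p]
    re_T = [[-sum((row[m].conjugate() * row[n]).real for row in K) for n in range(M)]
            for m in range(M)]
    I = [2.0 * sum((d * row[m].conjugate()).real for d, row in zip(Fd, K)) for m in range(M)]
    return K, re_T, I

def threshold(f, dr, alpha):
    """Eqn (5); A1 (A2) is set to 0 where the left (right) neighbour does not exist."""
    M, S = len(f), []
    for m in range(M):
        a1, a2 = float(m > 0), float(m < M - 1)
        left = f[m - 1] if m > 0 else 0.0
        right = f[m + 1] if m < M - 1 else 0.0
        S.append(2 * alpha * (f[m] * (dr + (a1 + a2) / dr) - (a1 * left + a2 * right) / dr))
    return S

def energy(f, Fd, K, dr, alpha):
    """Discretized eqn (2) with the forward-difference form of eqn (3)."""
    F = [sum(k * fm for k, fm in zip(row, f)) for row in K]
    misfit = sum(abs(d - Fi) ** 2 for d, Fi in zip(Fd, F))
    reg = sum(v * v for v in f) * dr + sum((b - a) ** 2 for a, b in zip(f, f[1:])) / dr
    return misfit + alpha * reg

def iterate(f, re_T, I, dr, alpha, gain):
    """One pass of eqn (4) applied to all neurons at once."""
    S = threshold(f, dr, alpha)
    return [fm + gain * (2.0 * sum(t * v for t, v in zip(row, f)) + Im - Sm)
            for fm, row, Im, Sm in zip(f, re_T, I, S)]

M, N, alpha = 32, 16, 1e-3
dr = 1.0 / M
r = [m * dr for m in range(M)]
p = [20.0 + 4.0 * i for i in range(N)]
truth = [1.0 if 10 <= m <= 14 else (0.5 if 20 <= m <= 22 else 0.0) for m in range(M)]
Fd = [sum(fm * cmath.exp(-1j * pi * rm) for fm, rm in zip(truth, r)) * dr for pi in p]
K, re_T, I = build_net(Fd, p, r, dr)
# Gershgorin bound on the largest Hessian eigenvalue of H gives a conservative gain.
gain = 1.0 / (max(sum(abs(2.0 * t) for t in row) for row in re_T) + alpha * (2 * dr + 8 / dr))
f = [0.0] * M
history = [energy(f, Fd, K, dr, alpha)]
for _ in range(50):
    f = iterate(f, re_T, I, dr, alpha, gain)
    history.append(energy(f, Fd, K, dr, alpha))
```

The recorded `history` is non-increasing, which is the behaviour the gain condition in the text is meant to guarantee; the particular bound used here is only one conservative way to obtain it.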

IV. OPTO-ELECTRONIC IMPLEMENTATION
OF NEURAL NET PROCESSOR

Optical processing systems offer potential for ultra-fast
speed and a means for realizing parallel processing
as well as massive interconnections among processing
elements. Therefore, optics can be used for the
implementation of neural net models (Farhat, Psaltis,
Prata, & Paek, 1985b) for computing and information
processing, while the decision making elements
in the implementation can be realized electronically
at present. The architecture for an opto-electronic
implementation of the neural net iterative algorithm
expressed in eqn (4) is shown in Figure 4. The neural
state vector f in the implementation is represented
by the output of the light emitting diode (LED) array.
Although an LED can only represent positive functions
with its output intensity, the real valued
function f^(j)(m) in microwave diversity imaging can
be handled by using separate LEDs to code positive
and negative values of f^(j)(m). The interactions
among neurons are provided by broadcasting the
neural states, that is, the outputs of the LEDs,
through the 2-D interconnectivity matrix mask
Re[T_{mm'}], and the output (the activation potential) of
each neuron is picked up by the photo-detector (PD)
array. The term I_m that represents the measurement
data and the adaptive threshold S_m that overcomes
the ill-posedness of the problem can be computed
digitally and injected into the system either
electronically or optically as shown, but completely
analog schemes for computing these terms are also
possible (an example of analog generation of S_m is
given below). The neural “gain” λ (λ < 1) can be
realized with an optical or electronic attenuator.
The resultant output from each attenuator would be
used to drive the LED array, which will in turn update
the neural state to f^(j+1)(m) for the (j + 1)th
iteration.
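
The idea of coding a bipolar state with separate positive and negative LEDs can be sketched as follows. The text only states that the state is split this way; applying the same split to the mask, and recombining four nonnegative partial sums by electronic subtraction, are assumptions of this sketch, as are the names `split` and `optical_matvec` and the example numbers.

```python
def split(values):
    """Code a real vector as two nonnegative intensity vectors (positive and negative channels)."""
    return [max(v, 0.0) for v in values], [max(-v, 0.0) for v in values]

def optical_matvec(mask, state):
    """Form mask times state from nonnegative masks and intensities, then subtract electronically."""
    pos_state, neg_state = split(state)
    out = []
    for row in mask:
        pos_row, neg_row = split(row)
        # Each of these four sums is nonnegative, as a detector reading must be.
        pp = sum(w * s for w, s in zip(pos_row, pos_state))
        nn = sum(w * s for w, s in zip(neg_row, neg_state))
        pn = sum(w * s for w, s in zip(pos_row, neg_state))
        np_ = sum(w * s for w, s in zip(neg_row, pos_state))
        out.append((pp + nn) - (pn + np_))
    return out

mask = [[0.5, -1.0, 0.25], [-0.75, 2.0, 0.0], [1.0, 1.0, -1.0]]   # hypothetical values
state = [2.0, -1.0, 4.0]
result = optical_matvec(mask, state)
```

The result matches the ordinary signed matrix-vector product even though no intermediate quantity is negative.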

The regularization factor S_m is adaptively generated
in analog fashion according to the neural state
in the net. Shown in Figure 5 is an optical shuffle
scheme for doing this. The input to this generating
system is the neural state array f^(j)(m) in Figure 4
and the output can be fed into the neural net iterative
processor shown in Figure 4 either electronically or
optically (e.g., through the vertical LED array S_m).
The anamorphic imaging systems, L_1 and L_2, are
used to smear the LEA (light emitting array) output
representing the neural states vertically over the beam
splitter cube (BSC). Properly arranged, the tilted
mirrors TM1 and TM2 can reflect the light from
the BSC to PDA1 to form the term [f^(j)(m − 1) +
f^(j)(m + 1)], while the mirrors M_1 and M_2 can reflect
the light from the BSC to PDA2 to form the term
f^(j)(m). The constants (1/Δr) and [Δr + (2/Δr)] shown in
Figure 4 are obtained by setting A_1 = A_2 = 1 in S_m
in eqn (5) for the cases of 0 < m < M and they can
be multiplied with the proper terms through optical
(or electronic) multipliers.

FIGURE 3. (a) 2-D image reconstructed by FFT; (b) 2-D image reconstructed by neural net processor.

FIGURE 4. Opto-electronic implementation.

FIGURE 5. Optical shuffle scheme for generating S_m. (TM1, TM2: tilted mirrors; PDA1, PDA2: photo-detector arrays.)
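
A numerical counterpart of the shuffle scheme is sketched below for the interior neurons (0 < m < M, A_1 = A_2 = 1): the neighbour sums and the neurons' own states are formed separately, scaled by the constants 1/Δr and Δr + 2/Δr, and combined into S_m. The comparison against a numerical derivative of αR(f) assumes the forward-difference discretization of eqn (3); that discretization, the sample values, and the function names are assumptions.

```python
def shuffle_terms(f):
    """Interior neurons only: neighbour sums f(m-1)+f(m+1) and own states f(m)."""
    neighbours = [f[m - 1] + f[m + 1] for m in range(1, len(f) - 1)]
    own = [f[m] for m in range(1, len(f) - 1)]
    return neighbours, own

def threshold_interior(f, dr, alpha):
    """S_m of eqn (5) with A1 = A2 = 1, for 0 < m < M."""
    neighbours, own = shuffle_terms(f)
    return [2 * alpha * ((dr + 2 / dr) * o - (1 / dr) * n) for n, o in zip(neighbours, own)]

def regularizer(f, dr):
    """Discretized eqn (3) with forward differences (an assumed discretization)."""
    return sum(v * v for v in f) * dr + sum((b - a) ** 2 for a, b in zip(f, f[1:])) / dr

def numeric_gradient(f, dr, alpha, m, h=1e-4):
    """Central-difference estimate of d(alpha R)/df(m)."""
    up, down = list(f), list(f)
    up[m] += h
    down[m] -= h
    return alpha * (regularizer(up, dr) - regularizer(down, dr)) / (2 * h)

f = [0.0, 0.3, 1.2, 0.7, -0.4, 0.1]     # hypothetical neural states
dr, alpha = 0.1, 0.05
S = threshold_interior(f, dr, alpha)
```

Under these assumptions the shuffle output coincides with the derivative of the regularization term, which is why S_m acts as the regularization-related threshold in eqn (4).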

V.  HEURISTIC  EXTENSION 

By examining the neural net iterative algorithm expressed
in eqn (4), the overall activation potential
of the mth neuron in the net can be written as

    u_m = \sum_{m'} W_{mm'} v_{m'} - \theta_m + \lambda I_m    (7)

where

    v_{m'} = f(m')
    W_{mm'} = \delta_{mm'} + 2\lambda\, \mathrm{Re}[T_{mm'}]
    \theta_m = \lambda S_m

The mapping equation in conventional neural nets
is

    v_m = g(u_m)    (8)

and g(u_m) is a sigmoidal function. For the problem
of image reconstruction, multivalued neural states v_m
have to be used to represent the bipolar object function.
In an attempt to make the neural net processor
more neuromorphic, modification must be made to
the conventional neural mapping equation expressed
in eqn (8) to preserve the multivalued neural responses.
Heuristically, therefore, the nonlinear mapping
will be confined only to the adaptive threshold
θ_m such that θ_m = g(S_m), with S_m being a linear
combination (window) of the neural state vector v of the
net. The term I_m, which represents the known
information, will not go through the nonlinear mapping
in order to preserve the original available information.
Accordingly, the neural state mapping equation
will be of the following form,

    v_m = \sum_{m'} W_{mm'} v_{m'} - g(S_m) + \lambda I_m    (9)

As mentioned earlier, the adaptive threshold
represents the regularization factor in the original
energy function. The introduction of the nonlinear adaptive
threshold g(S_m) will extend the set of applicable
regularization functions. The adaptive
threshold S_m in eqn (5) can be viewed (apart from the
common factor 2α) as a linear
convolution (or a kind of linear mapping) f * x of
the neural states f with a filter x consisting of three
discrete points having values

    x(-1) = -\frac{A_2}{\Delta r}, \quad x(0) = \Delta r + \frac{A_1 + A_2}{\Delta r}, \quad x(1) = -\frac{A_1}{\Delta r}

respectively, as shown in Figure 6(a) when A_1 =
A_2 = 1. This linear mapping can be replaced with a
nonlinear mapping of the form ξ tanh(η S_m), where
ξ and η are constant parameters. A preliminary image
obtained with this nonlinear adaptive threshold
is given in Figure 6(b), which reveals features of the
original object that were not discernible in the preceding
images. For example, the double barreled nature
of the engines is now more clearly delineated.
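
The replacement of the linear window by a saturating one can be sketched as below. The three-point window follows the filter values with A_1 = A_2 = 1 as defaults; zero padding at the ends, the omission of any overall scale factor, and the particular values of ξ and η are assumptions made only for illustration.

```python
import math

def window(f, dr, a1=1.0, a2=1.0):
    """Three-point linear map S = f * x; states outside the array are taken as zero."""
    padded = [0.0] + list(f) + [0.0]
    x_prev, x_mid, x_next = -a1 / dr, dr + (a1 + a2) / dr, -a2 / dr
    return [x_prev * padded[m - 1] + x_mid * padded[m] + x_next * padded[m + 1]
            for m in range(1, len(padded) - 1)]

def nonlinear_threshold(f, dr, xi, eta):
    """Adaptive threshold xi * tanh(eta * S_m) applied to the windowed states."""
    return [xi * math.tanh(eta * s) for s in window(f, dr)]

states = [0.0, 1.0, 0.0]                 # hypothetical neural states
linear = window(states, dr=0.5)
saturated = nonlinear_threshold(states, dr=0.5, xi=2.0, eta=10.0)
nearly_linear = nonlinear_threshold(states, dr=0.5, xi=2.0, eta=1e-4)
```

For small η the threshold is practically proportional to the linear window, while for large η it saturates at ±ξ, so ξ bounds how strongly the regularization term can act on any one neuron.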

VI.  DISCUSSION 

The neural net processor can also be used for object
reconstructions when the relationship between the
measurement data and the object function is not
necessarily a Fourier transform; in this case, no
fast and robust reconstruction algorithm has been found
so far. But the neural net processor concept presented
here can be easily applied just by modifying the optical
mask representing the underlying transform to
achieve fast and robust reconstructions. The term
I_m = 2 Re[Σ_{i=1}^{N} F_d(i) K*_{im}] is actually the real part of the
complex range-profile computed from measured frequency
response data by FFT, and it can be viewed
as the partial input “seed” or “key” to the neural
net, serving as the initial estimate of the real range profile of
the object function to be reconstructed. The neural
net processor concept described can be applied to a
wide range of practical problems simply by inputting
the corresponding partial “key” as I_m to the neural
net. Nonlinear regularization functions can be introduced
in the neural net processor as described, and
many nonlinear or linear regularization functions
with a high degree of smoothness that are difficult to
realize rapidly by digital computation can be easily
realized optically. The regularization parameter α,
which controls the fitness and smoothness of the
reconstructions in our research of microwave image
reconstructions, can also be adaptively changed
depending on the fitness of the reconstruction with the
measurement data during the iterative process of the
neural network. Doing imaging adaptively, and
incorporating attractive information processing features
of neurons, could make processors of the kind
described here unique and powerful enough to outperform
many existing processors.

FIGURE 6. (a) Linear mapping function; (b) image reconstructed by nonlinear mapping.
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APPENDIX:  DERIVATION  OF  THE 
ITERATIVE  EQUATION 

This  appendix  derives  the  iterative  equation  for  the  neural  net 
processor.  The  energy  function  for  the  neural  net  processor  is 
given  in  eqn  (1)  and  is  rewritten  here. 

$$H(f) = \|F_d - F\|^2 + \alpha R(f). \tag{A.1}$$

As mentioned in the text, $f$ denotes the object function to be
reconstructed, $F$ is the Fourier transform of $f$, $F_d$ is the
frequency response obtained from measurement, $R(f)$ is a
regularization operator on $f$, and $\alpha$ is a regularization
parameter. In the derivation here, we will use the following form
as the norm in the frequency domain,

$$\|F_d - F\|^2 = \sum_{i=1}^{N} |F_d(i) - F(i)|^2, \tag{A.2}$$

where $N$ is the number of samples in the frequency space and also
the number of frequency measurement samples; $F_d(i)$ and $F(i)$
are the frequency response samples at the $i$th frequency point.

For a spatially limited function $f(r)$ and its Fourier transform
$F(p)$, there exists the following relationship,

$$F(p) = \int_{-\infty}^{+\infty} f(r)\, e^{-jpr}\, dr, \tag{A.3}$$

where the spatially limited function $f(r)$ is assumed to satisfy

$$f(r) \begin{cases} \neq 0 & \text{if } r \in [a, b], \\ = 0 & \text{otherwise.} \end{cases}$$

Usually, $F(p)$ is measured at equally spaced discrete frequency
points in $[p_1, p_2]$, that is,

$$p_i = p_1 + (i - 1)\,\frac{p_2 - p_1}{N - 1}, \qquad i = 1, 2, \ldots, N, \tag{A.4}$$

where $[p_1, p_2]$ corresponds to the frequency band
$[\omega_1, \omega_2]$ used in the measurement.

Replacing eqn (A.3) with a linear algebraic equation for
computation purposes gives

$$F(i) = F(p_i) = \sum_{m=0}^{M} K_{im} f(m). \tag{A.5}$$

The approximation sum of Simpson's rule for an integral is

$$\int_a^b g(x)\,dx \approx \frac{h}{3}\left[\,y_0 + y_l + 2\,(y_2 + y_4 + \cdots + y_{l-2}) + 4\,(y_1 + y_3 + \cdots + y_{l-1})\right],$$

where $l$ is an even number and the function $g(x)$ is evaluated at
the points $x_k$, which satisfy the relationship $x_k = a + kh$
($h$ can be a constant or variable), to give $y_k = g(x_k)$ for
$k = 0, 1, 2, 3, \ldots, l-1, l$.
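As an illustration of the Simpson sum just quoted, the following minimal Python sketch builds the coefficient pattern 1, 4, 2, 4, ..., 2, 4, 1 and applies it with a constant step h. The function names `simpson_weights` and `simpson` are hypothetical and chosen only for this example; they do not appear in the original work.

```python
def simpson_weights(l):
    """Simpson coefficients [1, 4, 2, 4, ..., 2, 4, 1] for l intervals (l must be even)."""
    if l < 2 or l % 2 != 0:
        raise ValueError("l must be an even integer >= 2")
    weights = []
    for k in range(l + 1):
        if k == 0 or k == l:
            weights.append(1)      # end points y_0 and y_l
        elif k % 2 == 0:
            weights.append(2)      # interior even-indexed points
        else:
            weights.append(4)      # odd-indexed points
    return weights


def simpson(g, a, b, l):
    """Approximate the integral of g over [a, b] with a constant step h = (b - a) / l."""
    h = (b - a) / l
    w = simpson_weights(l)
    return (h / 3.0) * sum(w[k] * g(a + k * h) for k in range(l + 1))


# Example: Simpson's rule integrates a quadratic exactly (up to rounding).
approx = simpson(lambda x: x * x, 0.0, 1.0, 4)
print(approx)
```

The same coefficient pattern is what enters the kernel of eqn (A.5) once the integral in eqn (A.3) is discretized.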

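If Simpson's sum is used in eqn (A.5), we have

$$f(m) = f(a + m\,\Delta r), \qquad \Delta r = \frac{b - a}{M}, \tag{A.6}$$

$$K_{im} = \frac{\rho\,\Delta r}{3}\, e^{-j p_i (a + m\,\Delta r)}, \tag{A.7}$$

where the constant $\rho$ is given according to Simpson's rule as

$$\rho = \begin{cases} 1 & \text{if } m = 0 \text{ or } m = M, \\ 2 & \text{if } m \text{ is even and } m \neq 0, M, \\ 4 & \text{if } m \text{ is odd,} \end{cases}$$

and $M$ is an even integer and $(M + 1)$ is the total number of
samples in the object domain.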

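If the regularization function in eqn (3) in the text is used and
the rectangular rule is used for its numerical evaluation, the energy
function in eqn (A.1) will be of the following form,

$$H(f) = \sum_{i=1}^{N} |F_d(i) - F(i)|^2 + \alpha \sum_{m=0}^{M} \left\{ f^2(m) + \left[\frac{f(m) - f(m-1)}{\Delta r}\right]^2 \right\} \Delta r$$

$$\quad = \sum_{i=1}^{N} \left[\, |F_d(i)|^2 - F_d(i) F^*(i) - F_d^*(i) F(i) + F(i) F^*(i) \right] + \alpha \sum_{m=0}^{M} \left\{ f^2(m) + \left[\frac{f(m) - f(m-1)}{\Delta r}\right]^2 \right\} \Delta r. \tag{A.8}$$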


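Using eqn (A.5), we will have

$$\sum_{i=1}^{N} F(i) F^*(i) = \sum_{i=1}^{N} \left[\sum_{m=0}^{M} K_{im} f(m)\right] \left[\sum_{m'=0}^{M} K^*_{im'} f^*(m')\right]$$

$$\quad = \sum_{m=0}^{M} \sum_{m'=0}^{M} \left[\sum_{i=1}^{N} K_{im} K^*_{im'}\right] f(m) f^*(m') = -\sum_{m=0}^{M} \sum_{m'=0}^{M} T_{mm'}\, f(m) f(m'). \tag{A.9}$$

Here, the notation

$$T_{mm'} = -\sum_{i=1}^{N} K_{im} K^*_{im'}$$

has been used, and $f^*(m') = f(m')$, that is, $f(m')$ is real, has
been assumed.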

Treating the other term accordingly, we will get

$$\sum_{i=1}^{N} \left[ F_d(i) F^*(i) + F_d^*(i) F(i) \right] = \sum_{m=0}^{M} I_m f(m), \tag{A.10}$$

with the notation

$$I_m = \sum_{i=1}^{N} \left[ K_{im} F_d^*(i) + K^*_{im} F_d(i) \right] = 2\,\mathrm{Re}\left[\sum_{i=1}^{N} F_d(i) K^*_{im}\right].$$

From eqn (A.5), it is seen that the direct Fourier inversion of the
measured frequency response $F_d(i)$ would yield an estimate of the
object function given by (with a scale constant)

$$\hat{f}(m) = \sum_{i=1}^{N} F_d(i) K^*_{im}, \tag{A.11}$$

and $I_m$ would just be the real part of this Fourier estimate.
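Using eqns (A.9) and (A.10), the energy function $H(f)$ can
be rewritten as

$$H(f) = -\sum_{m=0}^{M} \sum_{m'=0}^{M} T_{mm'}\, f(m) f(m') - \sum_{m=0}^{M} I_m f(m) + \sum_{i=1}^{N} |F_d(i)|^2$$

$$\quad + \alpha \sum_{m=0}^{M} \left[ \Delta r\, f^2(m) + \frac{1}{\Delta r} f^2(m) - \frac{2}{\Delta r} f(m) f(m-1) + \frac{1}{\Delta r} f^2(m-1) \right]. \tag{A.12}$$

Note that $f(m) = 0$ for $m < 0$ or $m > M$ has been assumed.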
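The first derivative of $H(f)$ with respect to $f(m)$ is found as

$$\frac{\partial H}{\partial f(m)} = -\sum_{m'=0}^{M} \left[ T_{mm'} + T_{m'm} \right] f(m') - I_m + 2\alpha \left\{ f(m) \left( \Delta r + \frac{2}{\Delta r} \right) - \frac{1}{\Delta r} \left[ A_1 f(m-1) + A_2 f(m+1) \right] \right\}, \tag{A.13}$$

where the constants $A_1$ and $A_2$ are given as

$$A_1 = \begin{cases} 1 & \text{if } 0 < m \le M, \\ 0 & \text{if } m = 0, \end{cases} \qquad A_2 = \begin{cases} 1 & \text{if } 0 \le m < M, \\ 0 & \text{if } m = M. \end{cases}$$

From $T_{mm'} = -\sum_{i=1}^{N} K_{im} K^*_{im'}$, it is seen that
$T_{m'm} = -\sum_{i=1}^{N} K_{im'} K^*_{im} = (T_{mm'})^*$, so

$$T_{mm'} + T_{m'm} = 2\,\mathrm{Re}[T_{mm'}]. \tag{A.14}$$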

If the higher order derivatives are ignored, the change $\Delta H$ of
the energy function $H(f)$ due to the change of the $m$th neuron's
output $f(m)$ can be written as

$$\Delta H = H^{(j+1)}(f) - H^{(j)}(f) = \frac{\partial H}{\partial f(m)}\,\delta f(m) = \frac{\partial H}{\partial f(m)} \left[ f^{(j+1)}(m) - f^{(j)}(m) \right]. \tag{A.15}$$

We want to find the minima of $H(f)$, so we want $H(f)$ to decrease
as $f(m)$ changes, that is,

$$\Delta H = \frac{\partial H}{\partial f(m)}\,\delta f(m) < 0. \tag{A.16}$$

Therefore, we take

$$\delta f(m) = f^{(j+1)}(m) - f^{(j)}(m) = -\lambda\,\frac{\partial H}{\partial f(m)}. \tag{A.17}$$

Here, the constant $\lambda$ is chosen in such a way as to ensure
descent of the energy of the net to a minimum of $H(f)$; this will be
discussed later in this appendix.
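The descent rule of eqn (A.17) can be sketched numerically as follows. This is a minimal illustration and not the authors' implementation: it assumes no regularization (the parameter a of eqn (A.1) set to zero), a unit-weight kernel in place of the Simpson-weighted kernel of eqn (A.7), and a small made-up set of frequency and range samples. The names `energy` and `descend` are hypothetical. Neurons are updated one at a time, with a step constant smaller than the reciprocal of the largest |T_kk|, in line with the condition derived later in this appendix.

```python
import cmath


def energy(Fd, K, f):
    """Data-misfit energy: sum over i of |F_d(i) - sum_m K_im f(m)|^2 (no regularization)."""
    return sum(abs(Fd[i] - sum(K[i][m] * f[m] for m in range(len(f)))) ** 2
               for i in range(len(Fd)))


def descend(Fd, K, f0, lam, sweeps):
    """Apply delta f(m) = -lam * dH/df(m), one neuron at a time, for a number of sweeps."""
    n_freq, n_obj = len(Fd), len(f0)
    # T[m][m2] = -sum_i K_im K*_im2 and I[m] = 2 Re sum_i F_d(i) K*_im
    T = [[-sum(K[i][m] * K[i][m2].conjugate() for i in range(n_freq))
          for m2 in range(n_obj)] for m in range(n_obj)]
    I = [2.0 * sum(Fd[i] * K[i][m].conjugate() for i in range(n_freq)).real
         for m in range(n_obj)]
    f = list(f0)
    for _ in range(sweeps):
        for m in range(n_obj):
            grad = -sum(2.0 * T[m][m2].real * f[m2] for m2 in range(n_obj)) - I[m]
            f[m] -= lam * grad
    return f


# Hypothetical test data: 4 frequency samples, 3 object samples, unit-weight kernel.
p = [1.0, 2.0, 3.0, 4.0]
r = [0.0, 0.5, 1.0]
K = [[cmath.exp(-1j * pk * rm) for rm in r] for pk in p]
f_true = [1.0, 0.0, 0.5]
Fd = [sum(K[i][m] * f_true[m] for m in range(3)) for i in range(4)]

lam = 0.125  # below 1 / |T_kk| = 1/4 for this kernel
f_est = descend(Fd, K, [0.0, 0.0, 0.0], lam, 50)
print(energy(Fd, K, [0.0, 0.0, 0.0]), energy(Fd, K, f_est))
```

Because the energy is quadratic in each neuron output, each single-neuron update with this step constant cannot increase the energy, which is the behavior the appendix sets out to guarantee.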

From eqns (A.13) and (A.14), and using the proper values for
the constants $A_1$ and $A_2$, we will have

$$f^{(j+1)}(m) - f^{(j)}(m) = \lambda \left[ \sum_{m'=0}^{M} 2\,\mathrm{Re}[T_{mm'}]\, f^{(j)}(m') + I_m - S_m \right], \tag{A.18}$$

with the term $S_m$ being

$$S_m = 2\alpha \left\{ f^{(j)}(m) \left( \Delta r + \frac{2}{\Delta r} \right) - \frac{1}{\Delta r} \left[ A_1 f^{(j)}(m-1) + A_2 f^{(j)}(m+1) \right] \right\}. \tag{A.19}$$

It is seen that eqn (A.18) is the iterative equation for the processor
given in eqn (4) in the text and, when $0 < m < M$, eqn (A.19) is
eqn (5) in the text.

Now let us determine the conditions that $\lambda$ must satisfy in
order to find the minimum of $H(f)$. When the value of the $k$th
neuron is changed from $f^{(j)}(k)$ to $f^{(j)}(k) + \Delta f(k)$, the
energy function change can be written as

$$H(f) + \Delta H_k(f) = H\big(f + \delta_{km}\,\Delta f(m)\big). \tag{A.20}$$

Referring to eqn (A.12), it is found that

$$\Delta H_k(f) = -\left[ \sum_{m'=0}^{M} 2\,\mathrm{Re}[T_{km'}]\, f(m') + I_k - S_k \right] \Delta f(k) - \left[ T_{kk} - \alpha \left( \Delta r + \frac{2}{\Delta r} \right) \right] [\Delta f(k)]^2. \tag{A.21}$$

Using eqns (A.17) and (A.18) to express $\Delta f(k)$, we will obtain

$$\Delta H_k(f) = -\left[ \frac{1}{\lambda} + T_{kk} - \alpha \left( \Delta r + \frac{2}{\Delta r} \right) \right] [\Delta f(k)]^2. \tag{A.22}$$

Since $[\Delta f(k)]^2$ is nonnegative, if $\lambda$ satisfies

$$\frac{1}{\lambda} + T_{kk} - \alpha \left( \Delta r + \frac{2}{\Delta r} \right) > 0, \tag{A.23}$$

then $\Delta H_k(f) \le 0$, that is, $H(f)$ will decrease and find its
minima as $f(m)$ changes.

From eqn (A.7) and the notation $T_{mm'} = -\sum_{i=1}^{N} K_{im} K^*_{im'}$, we
can find

$$T_{kk} = -\sum_{i=1}^{N} K_{ik} K^*_{ik} = -\sum_{i=1}^{N} |K_{ik}|^2, \tag{A.24}$$

which is a known constant for the given sample interval $\Delta r$ in the
object domain and the total number of samples $N$ in the frequency
domain. Therefore, $\lambda$ can be determined from the known
quantities according to eqn (A.23).

LEARNING NETWORKS FOR EXTRAPOLATION AND
RADAR TARGET IDENTIFICATION

ABSTRACT: The problem of extrapolation for near-perfect reconstruction and target
identification from partial frequency response data by neural networks is discussed. Because
of ill-posedness, the problem has traditionally been treated with regularization methods.
The relationship between regularization and the role of hidden neurons in layered neural
networks is examined, and a layered nonlinear adaptive neural network for performing
extrapolations and reconstructions with excellent robustness is set up. The results are then
extended to neuromorphic target identification from a single "look" (single broad-band
radar echo). A novel approach for achieving 100% correct identification in a learning net
with excellent robustness employing realistic experimental data is also given. The findings
reported could potentially obviate the need to form radar images in order to identify targets
and could furnish a viable and economical means for identifying non-cooperative targets.

1  Introduction 

For an object function o(r) of finite spatial extent, the corresponding frequency response
F(p) extends over the entire frequency domain $-\infty < p < +\infty$. Because of practical
constraints, the frequency response F(p) can only be measured over a finite frequency window
$p_1 < p < p_2$ to give the measured frequency response $F_m(p)$. The traditional and widely
used approach of Fourier inversion, by means of a discrete Fourier transform (DFT), as an
algorithm for retrieving o(r) from $F_m(p)$ violates a priori knowledge of the object function
and yields an estimate of o(r) with limited resolution, which may not satisfy the resolution
requirements in demanding applications.

More sophisticated methods for retrieving a better estimate of o(r) from $F_m(p)$ exist.
The retrieval of o(r) from the partial information $F_m(p)$ in the presence of noise is, however,
known to be an ill-posed problem [1],[2]. Studies have been carried out for retrieving o(r)
by incorporating a priori knowledge and minimizing a certain "cost function" related to
$F_m(p)$ subject to a given criterion [3]. Mathematically, the function to be minimized can
be put into the following general form:

$$H(o) = \|F_m - F\|^2 + \alpha R(o) \tag{1}$$

where $F_m$ is the measured frequency response; $F$ is the Fourier transform of the estimate
function o(r); $R(o)$, called the regularization function, is to ensure that the reconstructed
o(r) has certain smoothness properties; and $\alpha$, called the regularization parameter, adjusts the
degree of fitness expressed in the first term on the right hand side of (1) relative to the
degree of regularization or smoothness expressed in the second term. For example, the
function $R(o)$ in Tikhonov's regularization method [1] is taken to be a sum of the squared
derivatives of o(r),

$$R_T(o) = \sum_{k} \left[ o^{(k)}(r) \right]^2 \tag{2}$$

to ensure that o(r) has the required degree of smoothness. Here $o^{(k)}(r)$ represents the $k$th
derivative of the function o(r).
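The trade-off expressed in (1) and (2) can be made concrete with a small numerical sketch. The following Python fragment is an illustration under stated assumptions, not the method of the paper: derivatives are replaced by forward finite differences, the sum in (2) is assumed to run over k = 0, ..., order and is integrated over r by the rectangular rule, and the Fourier transform is approximated by a plain Riemann sum. The names `forward`, `tikhonov_penalty`, and `cost` are hypothetical.

```python
import cmath


def forward(o, r, p):
    """Riemann-sum stand-in for the Fourier transform: F(p_k) = sum_i o(r_i) exp(-j p_k r_i) dr."""
    dr = r[1] - r[0]
    return [sum(oi * cmath.exp(-1j * pk * ri) for oi, ri in zip(o, r)) * dr for pk in p]


def tikhonov_penalty(o, r, order=1):
    """Sum over k = 0..order of the squared k-th finite-difference derivative (assumed range of k)."""
    dr = r[1] - r[0]
    total = 0.0
    d = list(o)
    for _ in range(order + 1):
        total += sum(v * v for v in d) * dr            # rectangular-rule integration
        d = [(d[i + 1] - d[i]) / dr for i in range(len(d) - 1)]
    return total


def cost(o, r, p, Fm, alpha, order=1):
    """H(o) = ||F_m - F||^2 + alpha * R(o), with R(o) the discrete penalty above."""
    F = forward(o, r, p)
    misfit = sum(abs(a - b) ** 2 for a, b in zip(Fm, F))
    return misfit + alpha * tikhonov_penalty(o, r, order)


# Hypothetical samples: a smooth and a rapidly oscillating candidate object function.
r = [0.0, 0.25, 0.5, 0.75, 1.0]
p = [2.0, 4.0, 6.0]
smooth = [1.0, 1.0, 1.0, 1.0, 1.0]
rough = [1.0, -1.0, 1.0, -1.0, 1.0]
print(tikhonov_penalty(smooth, r), tikhonov_penalty(rough, r))
```

For a fixed data misfit, a larger regularization parameter penalizes the oscillating candidate more heavily than the smooth one, which is the sense in which the second term of (1) enforces smoothness.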

There are limitations, however, to all existing reconstruction algorithms: either an
algorithm works well only for a certain class of object functions or the a priori knowledge
requirement is too stringent to be satisfied. The maximum entropy algorithm [4], which
works well for point-like object functions, can be placed into the former class, while the
Papoulis-Gerchberg algorithm [5],[6], which requires knowledge of the exact extent of
objects, can be placed into the latter class. By inspecting equation (1), one appreciates that
reconstructions will be dependent upon the regularization function $R(o)$ chosen and that a
given $R(o)$ will only ensure certain regularization (or smoothing) properties for the object
function o(r). This is the reason why different algorithms with different $R(o)$ work well only
for a certain set of object functions. For example, the maximum entropy algorithm works
well for point-like object functions and Tikhonov's regularization is good for continuous
object functions. This represents one difficulty in choosing the regularization function $R(o)$
in setting up the cost function $H(o)$ in (1). Another difficulty is in choosing the
regularization parameter $\alpha$ for a given reconstruction problem. For practical reconstructions from
noise-contaminated data, the parameter $\alpha$ can be chosen mathematically depending on the
signal-to-noise ratio in the data. This in turn introduces the added problem of having to
estimate the signal-to-noise ratio, which in practice is difficult to do.

Neural net models offer a new dynamic approach to collective nonlinear signal processing
that is robust and fault-tolerant and can be extremely fast when parallel processing
techniques are utilized [3],[7]. Neural net models provide a new way of looking at signal
processing problems and can offer novel solutions. A neural net processor for solving image
reconstruction problems through the minimization of an energy function of the type given
in (1) has been studied earlier [3]. Here, a neural net approach to the problem involving
self-organization and learning is investigated. By making use of the neural paradigm, albeit
in a highly simplified and loose sense, our nets allow for complex neurons and complex
interconnection weights, in addition to the more biologically plausible real neurons and real
interconnects. An adaptive three-layer neural net will be used to solve image reconstruction
problems. Learning is carried out in the net to change the interconnections between neurons
in different layers by using the error back-propagation algorithm [8]-[11].

The analogy and relationship between the role played by hidden neurons and that played
by regularization functions in neuromorphic solutions of the image reconstruction problem in
(1) will be discussed. It will be shown that hidden neurons play a certain regularization role,
and that regularization functions in neuromorphic processors can be realized with hidden
neurons. In the approach presented, learning enables the neural net to form automatically
the regularization function $R(o)$ and the regularization parameter $\alpha$, and to carry out
near-perfect reconstructions adaptively and with excellent robustness.

The near-perfect reconstruction results motivate further study of object recognition
with label representations. A three-layer nonlinear net will be discussed for practical radar
target identification. A novel approach to achieve perfect (100% correct) identification of
three test targets utilizing realistic data collected in an anechoic chamber using scale models
of actual targets will also be presented. The findings support and demonstrate the viability
of the neuromorphic automated target identification first proposed by Farhat et al. [16] as a
replacement for the traditional, but considerably less economical, approach of radar imaging.


2  Problem  Formulation 


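For a spatially limited object function o(r) and its Fourier transform F(p), there exist the
following well-known relationships:

$$F(p) = \int_{-\infty}^{+\infty} o(r)\, e^{-jpr}\, dr \tag{3}$$

$$o(r) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} F(p)\, e^{jpr}\, dp \tag{4}$$

where the spatially limited o(r) satisfies

$$o(r) \begin{cases} \neq 0 & \text{if } r \in [r_1, r_2] \\ = 0 & \text{otherwise.} \end{cases} \tag{5}$$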

The spatial frequency variable $p$ has the dimension of inverse length [m$^{-1}$]. The spatial
frequency band corresponding to the frequency band $[\omega_1, \omega_2]$ used for measurement will
be denoted as $[p_1, p_2]$. When the frequency response F(p) is measured at equally spaced
discrete frequency points over the measurement band $[p_1, p_2]$, that is, at the frequency
points

$$p_k = p_1 + (k - 1)\,\Delta p, \qquad k = 1, 2, \ldots, N \tag{6}$$

where $N$ is the total number of measurements taken and $\Delta p = (p_2 - p_1)/(N - 1)$, the
estimate of the object function by the discrete Fourier transform (DFT) algorithm (the
discrete form of (4)) can be expressed as

$$o(i) = o(r_i) = \frac{\Delta p}{2\pi} \sum_{k=1}^{N} F_m(p_k)\, e^{j p_k r_i}, \qquad i = 1, 2, \ldots, M \tag{7}$$

where $\Delta r = (r_2 - r_1)/(M - 1)$ is the object function sampling interval and $M$ the total
number of samples in the object domain. The resolution of the DFT estimation is known to
be proportional to $2\pi/(p_2 - p_1)$ and is insufficient for discerning object detail with spacing
finer than $2\pi/(p_2 - p_1)$. Several methods for exceeding this resolution limit and achieving
super-resolution have been studied in the past [4]-[6], but these methods suffer from certain
limitations, as noted in the introduction. Reconstructing microwave radar images from data
processed by minimizing an energy function of the form given in (1) through neuromorphic
processing has previously been considered [3]. Results of our continuing work on the
relationship between the role of hidden neurons and regularization functions discussed in [3]
are presented in the next two sections.
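The DFT estimate of (7) and its resolution limit can be illustrated with a short Python sketch. It is offered only as an example under assumptions: the object is taken to be a single ideal point scatterer, so that its frequency response is a pure phase term, and the band, the number of samples, and the scatterer position are made-up values. The name `dft_estimate` is hypothetical.

```python
import cmath
import math


def dft_estimate(Fm, p, r):
    """Discrete form of the inverse transform: o(r_i) = (dp / 2 pi) sum_k F_m(p_k) exp(j p_k r_i)."""
    dp = p[1] - p[0]
    return [dp / (2.0 * math.pi) * sum(F * cmath.exp(1j * pk * ri) for F, pk in zip(Fm, p))
            for ri in r]


# Assumed measurement band [p1, p2] = [10, 40] with N = 31 equally spaced points.
p = [10.0 + k for k in range(31)]
# Assumed object grid on [0, 1] with 101 samples, and a point scatterer at r0 = 0.3.
r = [i / 100 for i in range(101)]
r0 = 0.3
Fm = [cmath.exp(-1j * pk * r0) for pk in p]

estimate = dft_estimate(Fm, p, r)
peak_index = max(range(len(r)), key=lambda i: abs(estimate[i]))
resolution = 2.0 * math.pi / (p[-1] - p[0])
print(r[peak_index], resolution)
```

The magnitude of the estimate peaks at the scatterer position, but the main lobe has a width on the order of the printed resolution, so two scatterers spaced more closely than that would not be separated by this estimate.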


3  Neuromorphic  Image  Reconstruction 


In this section we present a brief review of radar image reconstruction by neuromorphic
processing [3] in order to lay the foundation for our subsequent discussion of the relation
between the role of hidden neurons in layered nets and regularization functions. The
function to be minimized in microwave radar imaging by neural net processing [3] has the same
form as that in (1),

$$H(o) = \|F_m - F\|^2 + \alpha R(o). \tag{8}$$

All quantities in (8) are the same as defined earlier. The norm defined in the complex space
C is of the following form:

$$\|F_m - F\|^2 = \sum_{i=1}^{N} |F_m(i) - F(i)|^2. \tag{9}$$

When the Fourier transform $F$ is expressed in terms of the object function o(r), the energy
function $H(o)$ in (8) will only be a function of the variable o(r), since $F_m$ is the known
measured frequency response. After some manipulations and by assuming that the object
function to be reconstructed in microwave radar imaging is real (see [3]), the following state
update equation for the neuromorphic processor can be obtained:

$$o^{(j+1)}(k) = o^{(j)}(k) + \Delta o(k) + \lambda I_k, \qquad 0 < k \le M \tag{10}$$

$$\Delta o(k) = \lambda \left[ 2 \sum_{i=1}^{M} \mathrm{Re}[T_{ki}]\, o^{(j)}(i) - S_k \right] \tag{11}$$

where $o^{(j)}(k)$ represents the state of the $k$th neuron at the $j$th iteration; $\lambda$ is defined as the
gain of the $k$th neuron; and $T_{ki}$ is a quantity which bears information about the
transformation (here the Fourier transform) from the space $C^{d_o}$ to the space $C^{d_F}$. The term $I_k$
represents the available information $F_m$, given by

$$I_k = 2\,\mathrm{Re}\left[ \sum_{i=1}^{N} F_m(i)\, K_{ik} \right] \tag{12}$$

where $K_{ik} = c \cdot e^{j p_i r_k}$ is the Fourier kernel and $c$ is a constant. Equation (12) is identified as
the real part of the complex object function generated by Fourier inversion of the measured
frequency response $F_m$. The term $S_k$ in (11) is viewed as a regularization-related adaptive
threshold, given by the following expression:

$$S_k = 2\alpha \left[ A_{kk}\, o^{(j)}(k) + A_{k(k-1)}\, o^{(j)}(k-1) + A_{k(k+1)}\, o^{(j)}(k+1) \right] \tag{13}$$

where $A_{kk}$, $A_{k(k-1)}$, and $A_{k(k+1)}$ are constants [3], for a stabilizing (regularizing) function
of the following form in Tikhonov's regularization method:


(14) 


(15) 


The  neural  net  update  transformation  as  expressed  in  (10)  is  carried  out  iteratively  until 
the  global  minimum  of  the  energy  function  of  (8)  is  reached. 

Microwave radar images reconstructed using the neural net processor described in (10)
showed improvement over images reconstructed by the DFT algorithm when Tikhonov's
stabilizing function (14), or an adaptive threshold linearly related to the neural states as expressed
in (13), was used [3]. In conventional neural nets, binary neurons and nonlinear mapping
of neural states are used [7]; this is largely responsible for the robust and fault-tolerant
collective signal processing properties of neural nets. The neural state update equation in
(10) is a linear iterative equation when the threshold of linear mapping of neural states
given in (13) is used; in this case, the only advantage exploited in a neural net using (10) to solve
the problem in (8) is the parallel processing capability of the neural net. No use is
made of nonlinear mapping. For the problem of image reconstruction in (8), multi-valued
(analog) neural states have to be used to represent the bipolar object function. Therefore,
in order to make the neural net processor in (10) more neuromorphic, nonlinear mapping
can be introduced only via the adaptive threshold $S_k$. A nonlinear function of the form

$$g(S_0) = \tanh(S_0), \tag{16}$$

similar to the sigmoidal function widely used in conventional neural nets [7], [8], was
introduced heuristically and employed for the adaptive threshold, with $S_0$ being a linear
combination of the neural states [3]. The adaptive threshold $S_k$ in (13) is a linear
combination of the three nearest states only, whereas $S_0$ in (16) denotes a linear combination of possibly
many states in general. The neural state update equation in (10) can then be rewritten as

$$o^{(j+1)}(k) = o^{(j)}(k) + \Delta o(k) + \lambda I_k, \qquad 0 < k \le M \tag{17}$$

$$\Delta o(k) = \lambda \left[ 2 \sum_{i=1}^{M} \mathrm{Re}[T_{ki}]\, o^{(j)}(i) - g(S_0) \right] \tag{18}$$

The neural net processor in (17) was used to reconstruct one-dimensional functions
(range-profiles) from measured frequency response data $F_m$ for a sufficiently wide range
of aspect angles of a scaled model of an aerospace test object. A two-dimensional object
function representing a projection image of the test object was formed by coherently
summing the back-projections of the one-dimensional range-profiles based on the projection-slice
theorem [3], [12].

The scale model used in this study is that of a B-52 airplane. Realistic frequency response
data $F_m$ for the object were gathered for a range of aspect angles in an anechoic chamber
microwave scatter measurement facility for two different frequency bands: one extending
from 6 GHz to 17 GHz and the other from 2 GHz to 26.5 GHz. Images reconstructed
from the two frequency bands by DFT inversion and back-projection are shown in Figs.
1(a) and (b), respectively. The image in Fig. 1(b) from the wider frequency band of 2
GHz to 26.5 GHz has a higher resolution than the image in Fig. 1(a) from the narrower
frequency band of 6 GHz to 17 GHz, as would theoretically be expected. It clearly shows
the double-barreled nature of the engines, a detail which is not clearly delineated in the
image in Fig. 1(a). The image reconstructed from frequency response data acquired over
the narrower band (6 GHz to 17 GHz) using the neural net processor expressed in (17) with
the nonlinear threshold mapping function given in (16) is shown in Fig. 1(c); this image
has nearly the same resolution as the image reconstructed over the band from 2 GHz to
26.5 GHz, and the double-barreled nature of the engines is once again clearly delineated. The
image quality obtained using the neural net processor expressed in (10) with the linear
threshold mapping function is inferior to that in Fig. 1(c), indicating the importance of
incorporating nonlinearity [3]. These results demonstrate the high resolution capability of
nonlinear neural nets in image reconstructions.




4  Relationship  Between  the  Role  of  Hidden  Neurons  and 
Regularization  Functions 

The neural net processor expressed in (17) is basically of the Hopfield variety [7]. It works
iteratively until a stable state of the net is reached to give a solution for the image
reconstruction problem of (8). The iterative process can be implemented by a parallel feedback
loop [3] in which the net's new state is obtained by the feedback of the state change $\Delta o(k)$
computed from the neural state for the preceding iteration (see schematic Fig. 2(a)). The
computation of $\Delta o(k)$ can be implemented by a subnet with one hidden layer of neurons as
shown in Fig. 2(b). By comparing (17) with Fig. 2(b) it can be noted that the hidden layer
neurons implement the nonlinear adaptive threshold related to the regularization function.
If the adaptive threshold is a linear mapping of neural states like that shown in (13), the
weights (or synaptic connections) used for the adaptive threshold can be combined with the
other weights that directly connect the input layer with the output layer. In this case,
the neural net update equation (10) can be rewritten as:

$$o^{(j+1)}(k) = o^{(j)}(k) + \Delta o(k) + \lambda I_k, \qquad 0 < k \le M \tag{19}$$

$$\Delta o(k) = 2\lambda \sum_{i=1}^{M} \left\{ \mathrm{Re}[T_{ki}] - \alpha\,\delta_{ki} A_{ki} - \alpha\,\delta_{(k-1)i} A_{ki} - \alpha\,\delta_{(k+1)i} A_{ki} \right\} o^{(j)}(i) \tag{20}$$

where $\delta_{ki}$ is the Kronecker delta. On the other hand, when the adaptive threshold is a
nonlinear mapping of neural states, the connections implemented from the input layer
through the hidden layer to the output layer in Fig. 2(b) cannot be combined with the other
direct connection weights from the input to the output layers. This demonstrates the
necessity of implementing an adaptive threshold representing a regularization function in
nonlinear neural nets with a hidden neural layer.

The relationship between the role of hidden neurons and regularization functions can
also be appreciated by examining the regularization role played by hidden neurons. Hidden
neurons are used to generate internal representations in neural networks and to extend the
computational (or mapping) power of simple two-layer associative networks [8]. In simple
two-layer associative networks, input patterns at the input layer are directly transformed
(or mapped), through the synaptic connections between neurons, into output patterns at
the output layer. No internal representations by hidden neurons are involved in such a
network. Because of this direct mapping property, simple networks will transform input
patterns of similar structure into output patterns of similar structure; consequently, such a
network will not be able to yield mapping outputs that are quite different when the inputs
are quite similar (or vice versa). A classic example of this situation, which has been discussed
by other researchers [8], is the exclusive-or (XOR) problem illustrated in Table 1.

    input pattern    output pattern
    00               0
    01               1
    10               1
    11               0

Table 1. XOR Mapping


In this example, the inputs (for example, 00 and 11), which are quite different, are to
be mapped into the same output (for example, 0). If two neurons in the input layer are
used to represent the two input bits and one neuron in the output layer is used to represent
one output bit in a simple two-layer network, it is impossible to find a set of weights and
thresholds for all the neurons that would perform the desired mapping [13]. Complications
in applying a simple two-layer net without hidden neurons to the XOR mapping problem
arise in mapping quite different patterns (11 and 00) to identical output (0), as well as
in mapping quite similar patterns (01 and 10) into identical output (1). Such a pair of
mappings is quite contradictory and, by definition, ill-posed. (For example, in inverse
scattering, the mapping (inverse) is known to be ill-posed if the solution of the mapping or
reconstruction does not exist or is sensitive to noise in the input data.) In the XOR problem
in a two-layer neural net, a network to perform the mapping cannot be found; thus it is an
ill-posed problem, since no solution for the problem exists.

On the other hand, a layer of hidden neurons inserted between the input and output
layers of a simple two-layer network will enable the network to perform arbitrary mapping
from input to output via the hidden neurons, if an adequate number of hidden neurons are
utilized [8], [13]. It can easily be verified that the network with a single hidden neuron shown
in Fig. 3 can perform the XOR mapping mentioned above. This network overcomes the
difficulty encountered in a two-layer net by using a hidden neuron to change the quite different
input patterns into patterns with sufficient similarity as seen by the output-layer neuron;
it accomplishes the task by using one hidden neuron for a two-bit to one-bit mapping.
The required weights of synaptic connections among the neurons, indicated in Fig. 3 by
the numbers on the arrows, are ultimately determined through learning (see, for example,
[8]-[11]). The numbers in the circles represent the required thresholds of the neurons, which
are assumed here to be fixed. All the neurons in the net are assumed to have only two states:
on (1) or off (0). The hidden neuron has output 1 (on) only when both input neurons have
states 1; otherwise it has output 0 (off). The output neuron will be turned on when it
has a net positive input greater than 0.5; the output neuron will be turned off (net input
smaller than 0.5) by the hidden neuron output through the synaptic connection weight of
-3.0 when both input neurons are on. From the point of view of the output neuron, the
inputs to it are quite similar when the input neurons are on (11) or off (00). Thus, the role
of the regularization or constraint function played by the hidden neuron is to change the
degree of similarity among the input patterns corresponding to the same output pattern.
This role can be considered to be the same as that of regularization functions for ill-posed
problems.
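The single-hidden-neuron XOR network described above can be checked with a few lines of Python. Since Fig. 3 is not reproduced here, several values in this sketch are assumptions made only for illustration: the input-to-hidden and input-to-output weights are taken as +1.0 and the hidden-neuron threshold as 1.5. The output threshold of 0.5 and the hidden-to-output weight of -3.0 are the values quoted in the text. The names `step` and `xor_net` are hypothetical.

```python
def step(net_input, threshold):
    """Two-state neuron: on (1) when the net input exceeds the threshold, otherwise off (0)."""
    return 1 if net_input > threshold else 0


def xor_net(x1, x2):
    """XOR through one hidden neuron plus direct input-to-output connections."""
    # Hidden neuron turns on only when both inputs are on (assumed weights +1.0, threshold 1.5).
    hidden = step(1.0 * x1 + 1.0 * x2, 1.5)
    # Output neuron: assumed direct weights +1.0, hidden-to-output weight -3.0, threshold 0.5.
    return step(1.0 * x1 + 1.0 * x2 - 3.0 * hidden, 0.5)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))
```

With the hidden neuron removed, the output would be on for the input 11 as well, which is the failure of the simple two-layer net discussed earlier in this section.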

The regularization role played by hidden neurons can also be appreciated from the
error back-propagation (EBP) algorithm, in which hidden neurons are used [8]-[11]. The
EBP algorithm for a general problem is also formulated so as to minimize the error energy
function,

$$E = \|\hat{O} - O\|^2 \tag{21}$$

where $\hat{O}$ is the specified or the desired output and $O$ is the output of the network for a
given  input.  For  the  given  input  and  the  specified  output,  the  error  signal  given  by  E  is 
fed-back  (or  back-propagated)  into  the  network  to  adjust  the  interaction  weights  (weights  of 
synaptic connections) among all neurons, including hidden neurons. This learning procedure
is  iterated  until  a  set  of  weights  is  arrived  at  for  which  the  specified  output,  or  equivalently, 
the  specified  minimum  of  the  energy  function,  is  reached.  Comparison  of  the  energy  function 
in  (8)  with  that  in  (21)  shows  there  is  no  regularization  operation  involved  in  (21).  It  is  well 
known  that  inversions  by  minimizing  the  error  energy  function  of  the  form  shown  in  (21)  in 
the  presence  of  noise  are  ill-posed  and  that  the  outputs  are  usually  not  stable  with  respect 
to  the  inputs.  From  our  simulation  results  obtained  by  networks  with  hidden  neurons,  it 
is  found  that  the  performance  of  the  networks  is  quite  robust  with  respect  to  inputs.  This 
demonstrates  further  that  the  role  played  by  the  regularization  operator  in  (8)  to  constrain 
the  output  in  ill-posed  mapping  problems  is  achieved  using  the  hidden  neurons  in  neural 
networks.  Impossible  mappings  in  a  neural  network  can  be  made  possible  by  increasing 
the number of hidden neurons; this can be explained by the fact that regularization is
introduced  or  further  enforced  by  the  increase  in  number  of  hidden  neurons. 

5  Reconstruction  by  Neural  Nets  Through  Learning 

The iterative neural net equation (10) can be cast in a closed form of a non-iterative equation
and  implemented  with  a  non-iterative  processor  when  an  adaptive  threshold  (13)  that  is  a 
linear  function  of  neural  states  is  used.  On  the  other  hand,  when  an  adaptive  threshold  (16) 
that  is  a  non-linear  function  of  neural  states  is  used,  the  iterative  neural  net  equation  (17) 
cannot be written in the closed form of a non-iterative equation. There is no known method
to directly implement the iterative equation with a non-iterative processor; this results from
the difficulty of choosing a different regularization $R(o)$ and a different parameter $\alpha$ in (8)
for a different reconstruction problem, since the first term on the right hand side of (8)
can be computed with a non-iterative DFT processor. This difficulty can be overcome by a
neural net through learning that enables formation of $R(o)$ and $\alpha$ automatically, depending
on the image to be reconstructed.

Hidden neurons have been shown to have a regularization effect in the last section. Hence
a  hidden  neural  layer  will  be  used  here  for  the  purpose  of  regularization,  overcoming  the 
ill-posedness  of  image  reconstruction  from  partial  frequency  response.  A  three-layer  neural 
net  with  feedforward  connections  for  image  reconstruction  is  schematically  shown  in  Fig. 
4. The input layer takes the frequency responses from measurements, and complex neurons
(i.e.,  their  states  are  complex  and  equal  to  the  real  and  imaginary  values  of  the  measured 
complex  frequency  response)  in  the  input  layer  are  connected  to  neurons  in  both  the  output 
layer and the hidden layer. The synaptic connections of neurons in the input layer to
neurons in either the output layer or the hidden layer are complex and will be fixed and
taken as the Fourier weights for the image reconstruction problems in situations in which
the measurement data and the image to be reconstructed have a Fourier transform relation.
The number of neurons in each of the three layers is assumed to be the same, for the moment, and
to equal the number of frequency points at which the response is measured. Images to be
reconstructed are assumed to be normalized to unity and the output from neurons in the
hidden layer will take a nonlinear function of the form tanh(·). Mathematically, the final
output neural state representing the image to be reconstructed is,

$$o(i) = z(i) + \tanh\Big[\sum_j r_{ij}\, z(j)\Big] \tag{22}$$

where $r_{ij}$ is the real-valued synaptic link between the ith neuron in the output layer and the
jth neuron in the middle (hidden) layer and,

$$z(l) = \sum_{k=1}^{N} \Re\big[W_{lk}\, F_m(k)\big], \qquad l = i, j \tag{23}$$

where $\Re[\cdot]$ represents the real part of the bracketed quantity and $W_{lk}$ are the Fourier weights.
Once more, a real object function o(i) is assumed for microwave radar imaging [3]; and z(l)
is recognized as the real part of the Fourier inversion of the measured frequency data $F_m$.
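A minimal sketch of this forward pass may help fix the notation. The form and normalization of the Fourier weights, the 21 sample points, and the spatial-frequency range are illustrative assumptions (the sizes and the range are borrowed from the simulations reported later); the names `fourier_weights` and `forward` are hypothetical.

```python
import numpy as np

def fourier_weights(r, p):
    """Assumed Fourier weights W[l, k] = exp(+1j * p[k] * r[l]) / len(p)."""
    return np.exp(1j * np.outer(r, p)) / len(p)

def forward(F_m, W, R):
    """Return (o, z) with z = Re[W @ F_m] as in (23) and o = z + tanh(R @ z) as in (22)."""
    z = np.real(W @ F_m)          # fixed Fourier mapping, real part only
    o = z + np.tanh(R @ z)        # direct path plus hidden-layer correction
    return o, z

n = 21
r = np.arange(n) * 0.2                    # sample positions in cm, covering [0, 4]
p = np.linspace(2.5, 7.1, n)              # assumed spatial-frequency samples in cm^-1
W = fourier_weights(r, p)

rng = np.random.default_rng(0)
F_m = rng.normal(size=n) + 1j * rng.normal(size=n)   # placeholder "measurement"
R = np.zeros((n, n))                      # untrained links: hidden layer contributes nothing
o, z = forward(F_m, W, R)
```

With all $r_{ij}$ set to zero the output reduces to z, the plain Fourier inversion, so the hidden layer acts purely as a learned correction to the DFT image.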

Learning in the neural net involves determining the synaptic weights $r_{ij}$ by an error
back-propagation algorithm [8]-[11]. With an error back-propagation algorithm, the neural
network can be made to learn, under supervision, to perform extrapolations and reconstructions
as follows: for a given desired or ideal object function D, when the measured
frequency response $F_m(p)$ is fed into the network in Fig. 4 and the output from the network
is denoted as o, an error function,

$$E = \|D - o\|^2 = \frac{1}{2}\sum_i |D(i) - o(i)|^2 \tag{24}$$

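can be defined. Since knowledge of the desired object function D at the output of the net is
required (D is also the ideal desired image at the output), the learning is supervised. Using
the chain differentiation rule, the change of the error function with respect to the change
of weight $r_{ij}$ can be written as,

$$\frac{\partial E}{\partial r_{ij}} = \frac{\partial E}{\partial o(i)}\,\frac{\partial o(i)}{\partial r_{ij}} \tag{25}$$

From equation (24),

$$\frac{\partial E}{\partial o(i)} = -[D(i) - o(i)] = -\delta_i \tag{26}$$

and from equation (22),

$$\frac{\partial o(i)}{\partial r_{ij}} = \tanh'\Big[\sum_j r_{ij}\, z(j)\Big]\, z(j) = z(j)\Big/\cosh^2\Big[\sum_j r_{ij}\, z(j)\Big] \tag{27}$$
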
Combining now (25), (26) and (27), the following equation is obtained,

$$\frac{\partial E}{\partial r_{ij}} = -\delta_i\, z(j)\Big/\cosh^2\Big[\sum_j r_{ij}\, z(j)\Big] \tag{28}$$


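To reduce the error signal in (24), the weight $r_{ij}$ can be changed through gradient descent
by an amount,

$$\Delta r_{ij} = -\eta\,\frac{\partial E}{\partial r_{ij}} = \eta\,\delta_i\, z(j)\Big/\cosh^2\Big[\sum_j r_{ij}\, z(j)\Big] \tag{29}$$

with $\eta$ being a constant controlling the learning rate.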
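The update in (29) can be sanity-checked numerically. The sketch below, which uses small random vectors as stand-ins for z and D (an assumption made purely for illustration), compares the analytic gradient implied by (28) with a central finite-difference estimate of the error in (24).

```python
import numpy as np

def forward(z, R):
    """Network output o = z + tanh(R @ z), as in (22)."""
    return z + np.tanh(R @ z)

def error(D, z, R):
    """Error function of (24): one half of the summed squared output error."""
    return 0.5 * np.sum((D - forward(z, R)) ** 2)

def weight_update(D, z, R, eta):
    """Delta r_ij = eta * delta_i * z(j) / cosh^2(sum_j r_ij z(j)), as in (29)."""
    u = R @ z
    delta = D - (z + np.tanh(u))
    return eta * np.outer(delta / np.cosh(u) ** 2, z)

rng = np.random.default_rng(1)
n = 5
z = rng.normal(size=n)                 # illustrative stand-in for the Fourier inversion
D = rng.uniform(0.0, 1.0, size=n)      # illustrative stand-in for the desired image
R = 0.1 * rng.normal(size=(n, n))

# Central finite differences of E with respect to every r_ij.
eps = 1e-6
numeric = np.zeros_like(R)
for i in range(n):
    for j in range(n):
        R_plus = R.copy()
        R_plus[i, j] += eps
        R_minus = R.copy()
        R_minus[i, j] -= eps
        numeric[i, j] = (error(D, z, R_plus) - error(D, z, R_minus)) / (2 * eps)

# With eta = 1 the update is exactly the negative gradient of (28).
analytic = -weight_update(D, z, R, eta=1.0)
max_gradient_mismatch = np.max(np.abs(numeric - analytic))
```

Agreement between the two gradients confirms that the weight change in (29) is a gradient-descent step on the error in (24).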

The  above  procedure  is  for  one  given  object  (or  pattern)  function  D.  When  there  are  M 
ideal  images  of  interest,  the  procedure  is  carried  out  M  times,  once  for  each  image.  For  each 
image  the  error  signal  is  checked  and  if  a  specified  error  criterion  (to  be  specified  below) 
is  not  satisfied,  the  procedure  is  repeated  again  for  every  pattern;  this  is  done  repeatedly 
until  the  error  signal  criterion  is  satisfied  for  each  image. 
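The multi-pattern procedure described above can be sketched as a loop over learning cycles. The toy data below are an assumption chosen for illustration: the two z vectors are orthogonal, so learning one pattern does not disturb the other, which is not true of the radar data used in the simulations. The tolerance of 0.097 is the error criterion used in the simulations reported in the next section, and the function names are hypothetical.

```python
import numpy as np

def max_error(patterns, R):
    """Largest |D(i) - o(i)| over all patterns and output neurons."""
    return max(np.max(np.abs(D - (z + np.tanh(R @ z)))) for z, D in patterns)

def train(patterns, R, eta=0.99, tol=0.097, max_cycles=50):
    """Present every pattern once per cycle, applying (29) after each presentation."""
    for cycle in range(1, max_cycles + 1):
        for z, D in patterns:
            u = R @ z
            delta = D - (z + np.tanh(u))
            R = R + eta * np.outer(delta / np.cosh(u) ** 2, z)
        # The error criterion is checked for every pattern at the end of the cycle.
        if max_error(patterns, R) < tol:
            return R, cycle, True
    return R, max_cycles, False

# Toy patterns (assumed): z is the fixed Fourier-inversion output, D the ideal image.
z1 = np.array([0.5, 0.5, 0.0, 0.0])
D1 = np.array([1.0, 1.0, 0.0, 0.0])
z2 = np.array([0.0, 0.0, 0.5, 0.5])
D2 = np.array([0.0, 0.0, 1.0, 1.0])
patterns = [(z1, D1), (z2, D2)]

rng = np.random.default_rng(0)
R0 = 0.01 * rng.normal(size=(4, 4))    # small, unequal initial weights
R, cycles, converged = train(patterns, R0)
print(cycles, converged)
```

The cap on the number of cycles is a safeguard added in this sketch; the procedure in the text simply repeats until the criterion is met.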


6  Simulation  Results  and  Robustness  Tests 


Simulations  were  carried  out  to  verify  the  learning  concept  presented  above.  Several  ideal 
object  functions  of  spatial  extent  within  [0,4]  cm  are  used.  The  number  of  neurons  for 
the input, middle, and output layers is assumed to be equal to 21 for each layer. The
small  number  of  neurons  used  and  the  small  extent  ([0,4]  cm)  of  the  function  occupied 
are  all  chosen  for  the  purpose  of  containing  the  computations  involved,  but  they  can  be 
increased  or  altered  at  will  to  any  desired  value.  The  frequency  response  of  the  object 
function  chosen  is  synthesized  (computed  digitally)  in  the  6-17  GHz  range  and  subjected 
in simulation to the action of the network in Fig. 4. The network can determine a set of
$r_{ij}$ links for a given set of functions to produce correct patterns within the specified error
criterion $\max_i |D(i) - o(i)| < 0.097$.

For one of our simulations, done for a set of two object functions, the first object function
is,

$$o_1(r) = \begin{cases} 1.0, & r \in [0.2, 1.2]\ \text{cm} \\ 0, & r \in [0, 0.2)\ \text{cm or } r \in (1.2, 4.0]\ \text{cm} \end{cases} \tag{30}$$

The second is,

$$o_2(r) = \begin{cases} 1.0, & r \in [2.2, 3.2]\ \text{cm} \\ 0, & r \in [0, 2.2)\ \text{cm or } r \in (3.2, 4.0]\ \text{cm} \end{cases} \tag{31}$$

These  two  functions,  shown  in  Figs.  5(a)  and  (b),  respectively,  have  spatial  extents  within 
0-4  cm.  The  frequency  responses  of  the  two  object  functions  synthesized  over  the  frequency 
window  6-17  GHz  are  shown  in  Figs.  6(a)  and  (b),  respectively.  If  the  DFT  inversion 
method  is  applied  to  the  frequency  data  in  Fig.  6,  a  low  resolution  image  with  most  of 
its  intensity  concentrated  around  the  sharp  edge  of  the  object  functions  will  be  obtained 
because  of  the  relatively  high  frequency  window.  Fig.  7  shows  the  reconstruction  of  the  first 
object  function  from  the  partial  frequency  domain  data  in  Fig.  6(a)  by  the  DFT  method. 
This  reconstruction  shows  that  there  is  a  relatively  broad  positive  pulse  at  the  position  of 
the  rising  edge  of  the  original  object  function  and  a  broad  negative  pulse  at  the  position  of 
the  falling  edge  of  the  original  object  function;  the  two  pulses  are  of  different  amplitude  even 
though  the  given  object  function  has  the  same  rising  and  falling  edges.  When  the  two  object 
functions  are  alternately  presented  to  the  network  in  Fig.  4  and  the  synaptic  connections 
are  changed  according  to  (29),  the  learning  process  gradually  converges  and  a  set  of  synaptic 
connections  is  learned  by  the  network,  enabling  it  to  provide  near-perfect  reconstructions 
of  the  object  functions  within  the  specified  error  criterion  when  the  frequency  response  of 
either  object  function  is  presented  to  the  network.  The  network  accomplishes  the  learning 
in  just  five  learning  cycles,  defined  as  the  process  of  presenting  the  two  patterns  to  the 
network  once  and  modifying  the  weights  following  each  pattern  presentation. 
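The test functions in (30) and (31) and the DFT inversion they are compared against can be set up as follows. This is only a sketch: the discrete forward model used to synthesize the frequency response, the normalization of the inversion, and the 21 evenly spaced frequency samples over p ∈ [2.5, 7.1] cm⁻¹ are assumptions, and the function names are hypothetical.

```python
import numpy as np

n = 21
idx = np.arange(n)
r = idx * 0.2                                      # cm, 21 samples on [0, 4]

# Integer indices avoid floating-point trouble at the interval end points.
o1 = ((idx >= 1) & (idx <= 6)).astype(float)       # 1 on [0.2, 1.2] cm, as in (30)
o2 = ((idx >= 11) & (idx <= 16)).astype(float)     # 1 on [2.2, 3.2] cm, as in (31)

p = np.linspace(2.5, 7.1, n)                       # cm^-1, band corresponding to 6-17 GHz

def frequency_response(o):
    """Assumed forward model: F(p_k) = sum_l o(r_l) * exp(-1j * p_k * r_l)."""
    return np.exp(-1j * np.outer(p, r)) @ o

def dft_image(F):
    """Real part of the Fourier inversion using only the measured band."""
    return np.real(np.exp(1j * np.outer(r, p)) @ F) / n

F1 = frequency_response(o1)
F2 = frequency_response(o2)
z1 = dft_image(F1)        # band-limited image of the first object function
z2 = dft_image(F2)        # band-limited image of the second object function
```

Plotting `z1` against `o1` gives a quick way to compare a band-limited inversion with the ideal function; the arrays `z1` and `z2` are also what a network of the form in (22) would receive on its direct and hidden paths.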

Figure 8 shows the outputs of the network for several typical learning cycles and demonstrates
how the network gradually learns the two patterns by adjusting its connection
weights. Shown in Fig. 8(a) are the outputs of the network for the first pattern (left
side) and for the second pattern (right side) after the network has been trained with the
first pattern only during the first learning cycle. It is seen from Fig. 8(a) that the output
from the network for the first pattern as input is near-perfect and the output for the second
pattern as input resembles the first pattern more than the second; this is understandable,
since the network has as yet learned only the first pattern. Completing the first
learning cycle by training the net next with the second pattern, we find the network is able
to give a near-perfect reconstruction of the second pattern as input (right side, Fig. 8(b)).
When the first pattern is presented, the output is altered, becoming more like a superposition
of the first pattern and the second pattern. This occurs because, during the learning
of the second pattern, the network loses some of its previous internal representation of the
first pattern. The internal representation of the first pattern is restored, however, in the
second learning cycle following the presentation of the first pattern again to the net. The
output (left side, Fig. 8(c)) from the network for the first pattern as input again approaches
a near-perfect reconstruction and the output (right side, Fig. 8(c)) for the second pattern
as input is much better than that obtained during the first learning cycle (right side, Fig.
8(a)). This result is also understandable since, so far, the network has been trained with the
first pattern twice (during the first and second learning cycles) and with the second pattern
once only (during the first learning cycle). The output for the second pattern (right side,
Fig. 8(d)) is improved during the second learning cycle after presenting the second pattern
to the network for learning; once again, this degrades the performance of the network in
recognizing the first pattern (left side, Fig. 8(d)). By repeatedly and alternately presenting
the two patterns to the network for learning, the network gradually adjusts its interconnection
weights to improve the reconstructions for both patterns. Shown in Figs. 8(e) and
(f) are the outputs of the network during the third learning cycle after the first and second
patterns have been presented to the network, respectively; the performance of the network
is seen to have improved in comparison with the corresponding cases in the second learning
cycle. After the first pattern has been presented to the network for learning during the
fourth learning cycle, the outputs for both patterns are much better (Fig. 8(g)), except for
the presence of some side lobes for the second pattern as input (right side, Fig. 8(g)). The
side lobe level is reduced to the specified tolerable error range of $\max_i |D(i) - o(i)| < 0.097$
during the fifth learning cycle. Fig. 8(h) shows the outputs of the network for both patterns
after the network has been presented with the first pattern for learning during the fifth or
the final learning cycle.

The choice of the learning rate $\eta$ is critical to the speed of the learning process. The
range of suitable learning rates can be analytically determined for learning algorithms involving
a linear function of neural states [14]. For the learning algorithm involving a nonlinear
function of neural states given in (29), it is, however, hard to analytically determine
the range of the learning rate. By inspecting (29), it is seen that the learning rate $\eta$ represents
the proportion by which the synaptic weight changes in accordance with the output
error induced by the current synaptic weights themselves. In our preceding simulations, the
learning rate chosen is usually $\eta = 0.99$. As indicated elsewhere, it would not make sense to
have the learning rate $\eta$ greater than 1, since this could "overcorrect" the output error [14], a
phenomenon that has been observed in our simulations. By "overcorrection", we mean that
the output error (energy) being minimized exhibits oscillations and sometimes is increased.
Overcorrection usually results in a longer convergence time. On the other hand, making
the learning rate too small could also slow down the learning process. Another cautionary
remark in carrying out the learning process is that the initial synaptic weights should not
be equal; otherwise, the network would obtain identical weights for all synaptic connections
[8]. The initial synaptic weights in our study were chosen randomly.

More complex-shaped object functions were also used to test the learning and reconstruction
capability of the neural net in Fig. 4. A set of two object functions is shown in
Fig. 9. The first function has a spatial extent of 0.2-0.8 cm (Fig. 9(a)) similar to that
shown  in  Fig.  5(a).  The  second  function  is  of  a  more  complicated  shape.  The  first  part 
of  this  function  is  a  pulse  of  0.8  cm  in  width  and  the  second  part  is  of  a  triangular  shape. 
After  a  set  of  synaptic  weights  is  learned  by  the  network  by  presenting  the  two  patterns 
to  the  net  five  times,  the  network  is  able  to  give  a  near-perfect  reconstruction  when  the 
frequency  response  of  either  function  is  presented  to  it.  The  reconstructions  of  the  two 
object  functions  by  the  network  are  shown  in  Fig.  10.  Comparing  Fig.  10(b)  and  Fig. 
9(b) shows that the reconstruction of the triangular portion of the second object function
is perfect; since the triangular part of the second function more closely resembles the undulations
of a continuous function, its perfect reconstruction suggests that the network performs better for
continuous functions.

Generalizations  and  Robustness:  The  two  simulations  presented  above  have  shown 
good  results  when  the  network  is  used  for  reconstructions  of  object  functions  that  it  has 
been presented with during the learning process. Generalization, which deals with the performance
of a network when inputs are similar to, but not specifically among, the training
sets the net has been presented with during the learning process, is an issue of practical
importance [14]. Generalization for extrapolations and reconstructions from partial frequency
information is studied here from the point of view of the network's performance
with  noise-contaminated  frequency  response  input  data. 

Based on the discussion in Section 4, it can be appreciated that hidden neurons play
a certain regularization role, and that such regularization makes the solution stable for
problems of extrapolations and reconstructions from partial frequency information. Numerical
simulations were conducted to verify that the network with hidden neurons in Fig.
4 provides sufficient regularization and is capable of giving stable and robust reconstructions
even in the presence of noise. One of these simulations was done with the test object
functions shown in Fig. 5. The frequency responses of the two object functions in Fig. 5
were contaminated with Gaussian noise with the following distribution function,

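$$G(N) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-N^2/(2\sigma^2)} \tag{32}$$

where N represents the noise amplitude, and $\sigma^2$ is the variance of Gaussian noise. Defining
the signal-to-noise ratio (SNR) as,

$$\mathrm{SNR} = \frac{\text{average signal energy in the given frequency band}}{\text{noise variance}} = \frac{1}{p_2 - p_1}\int_{p_1}^{p_2} |F(p)|^2\, dp \Big/ \sigma^2 \tag{33}$$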

we find that when SNR = 5, the noise-contaminated frequency responses for the two object
functions are as shown in Fig. 11 for the frequency band 6-17 GHz corresponding to
$p \in [2.5, 7.1]$ cm$^{-1}$. The difference before and after noise contamination can be seen by
comparing Fig. 6 and Fig. 11. Even though the frequency responses in Fig. 11 after noise
contamination differ appreciably from the noise-free frequency responses in Fig. 6, the network,
which learned a set of synaptic connections using the noise-free frequency information,
is still able to yield reconstructions of high quality, as shown in Fig. 12. The reconstructions
in Fig. 12 from the noise-contaminated frequency information show a weak side-lobe
structure compared with the reconstructions in Fig. 8(h), where noise-free frequency information
is used as input. When the SNR is further decreased, the side lobe structure
in the reconstructions from noise-contaminated frequency information will increase. The
reconstruction from noisy frequency response data can be improved by training the network
with noise-free, as well as some noise-contaminated frequency data. For studies with the
two test patterns considered here, the network was trained with the noise-free frequency
data shown in Fig. 6, and also with the noisy frequency responses (SNR = 1) shown in Fig.
13. The ideal patterns needed in the supervised learning process for the noise-free and noisy
data were specified to be the same as those shown in Fig. 5. The noise-free data and the
noisy data were presented alternately to the net to adjust the connection weights until the
specified error criterion $\max_i |D(i) - o(i)| < 0.097$ for every pattern was reached. When the
resulting network is tested using noisy frequency response data (SNR = 5) as input after the
stated training, the outputs from the network are as shown in Fig. 14. Comparing Fig.
14 with Fig. 12 shows a clear improvement of the side-lobe structure, the result of mixing
instances of noisy and noise-free data sets in training the network. In practice, a network,
being trained with examples of data from its environment, is expected to encounter differing
levels of SNR. The findings above suggest that this could be beneficial for enhancing the
performance of the net.
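The noise contamination used in these robustness tests can be sketched as below. The SNR is computed as a discrete form of (33), with the band average replaced by a mean over the frequency samples. How the noise was applied to the complex frequency data is not stated in the text; adding independent noise of variance σ² to the real and imaginary parts is an assumption, as is the placeholder response used in the example.

```python
import numpy as np

def add_noise(F, snr, rng):
    """Return (noisy F, sigma) for a requested SNR.

    SNR = mean(|F|^2 over the band) / sigma^2, a discrete form of (33).
    Assumption: independent zero-mean Gaussian noise of variance sigma^2 is
    added to the real and to the imaginary part of each frequency sample.
    """
    sigma = np.sqrt(np.mean(np.abs(F) ** 2) / snr)
    noise = rng.normal(0.0, sigma, F.shape) + 1j * rng.normal(0.0, sigma, F.shape)
    return F + noise, sigma

rng = np.random.default_rng(0)
F = np.exp(-1j * np.linspace(0.0, 3.0, 21))    # placeholder response with |F| = 1
F_snr5, sigma5 = add_noise(F, snr=5.0, rng=rng)    # test condition used in the text
F_snr1, sigma1 = add_noise(F, snr=1.0, rng=rng)    # noisier data mixed into training
```

A mixed training set in the sense described above would then contain both `F` and `F_snr1`, each paired with the same ideal pattern.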

7  Radar  Target  Identification  by  Layered  Networks 

The  preceding  discussion  shows  that  robust  extrapolation  and  near-perfect  reconstruction 
can  be  achieved  with  layered  nonlinear  networks.  An  interesting  issue  is  whether  there 
always  exists  a  network  that  can  do  extrapolations  and  reconstructions  for  a  given  finite 
number of functions or patterns of interest. A theorem concerning multi-layer neural networks,
which simply states that a multi-layer network with a sufficient number of hidden
neurons is able to perform any kind of mapping from input to output [8], [15], makes it
possible  for  the  network  shown  in  Fig.  4  to  perform  extrapolations  and  reconstructions  of 
any  finite  number  of  functions  of  interest,  if  enough  hidden  neurons  are  used  in  the  network. 
For  a  finite  number  of  aerospace  targets,  a  two-dimensional  object  function  describing  the 
geometrical shape of each target can be formed from the one-dimensional functions reconstructed
by a learning net, as described in the last section, through extrapolation of partial
frequency  response  data  acquired  for  fixed  aspects  of  the  targets  over  a  sufficiently  wide 
range  of  aspect  angles  [3].  The  two-dimensional  image  obtained  in  this  fashion  can  provide 
sufficiently  high  resolution  through  data  acquisitions  over  a  wide  range  of  aspect  angles 
and  extrapolations  of  the  measured  frequency  response  data  for  every  aspect.  Such  high 
resolution  images,  like  those  shown  in  Fig.  1,  would  enable  a  human  observer  to  recognize 
and identify the target. Another more attractive and less involved concept in target identification
does not involve forming an image. It provides for target identification from an
identifying label of the target generated by a neural net automatically from input information
(i.e., frequency response data) belonging to that target [16]. This approach is necessary
in situations where aspect information (frequency response echos for various aspects) of the
target cannot be obtained over a sufficiently wide range of aspect angles because of practical
limitations and a high-resolution image of the target consequently cannot be formed
[16]. The issue then is that of radar target identification from a single frequency response
echo for any practical aspect of the target, or a few such echos, using a layered nonlinear
network through self-organization and learning.

The traditional approach in nonimaging radar target recognition has been to extract
from suitably formed radar echos characteristic features or signatures of the targets and
to  compare  these  with  a  library  of  such  signatures  [17].  This  kind  of  approach  is  basically 
a  parametric  estimation  method  and  makes  certain  assumptions  about  the  form  of  the 
return  signals  or  echos  as  expressed  by  several  parameters.  The  extraction  of  the  assumed 
parameters  used  in  the  approach  is  usually  sensitive  to  noise  [18]  and  there  is  no  adaptation 
involved. 

The  network  used  for  target  recognition  in  our  work  is  shown  in  Fig.  15.  This  network 
is  a  variation  of  the  network  used  in  Fig.  4  for  extrapolations  and  reconstructions.  In 
the network in Fig. 4, which was shown to be robust in extrapolations and reconstructions
from partial information, the number of output neurons was equal to the number of
samples representing the function to be reconstructed. In the network shown in Fig. 15,
intended to perform robust target recognition from partial information, the number of output
neurons is chosen to allow forming enough distinguishable labels to represent all targets
of  interest.  Using  labels  instead  of  object  functions  makes  learning  easier,  since  the  ideal 
object  functions  that  are  needed  to  accomplish  learning  for  extrapolations  and  near-perfect 
reconstructions,  and  that  are  not  easy  to  obtain  for  aerospace  targets  in  general,  are  now 
not  required.  Since  label  representations  rather  than  object  functions  of  targets  are  to  be 
used  for  identification  in  this  case,  no  direct  connections  between  output  neurons  and  input 
neurons in Fig. 15 are used, thus simplifying the structure of the network. As before, the
connections  from  input  neurons  to  hidden  neurons  accomplish  Fourier  mapping,  i.e., 

$$z(j) = \sum_{k=1}^{N} \Re\big[W_{jk}\, F_m(k)\big] \tag{34}$$

where $W_{jk}$ represents the Fourier weight for inverting the known (measured) partial frequency
domain information $F_m(k)$. For target recognition from other than frequency domain
information, the weights $W_{jk}$ are set up in accordance with the applicable transform,
or else they are determined through training. The input to an output neuron in Fig. 15 is
given by,

$$u_i = \sum_j r_{ij}\, z(j) \tag{35}$$

where $r_{ij}$ again represents the weight from neuron j in the hidden layer to neuron i in the
output  layer  to  be  determined  by  learning.  The  output  neuron  state  is  now  given  by  the 
expression,

$$o(i) = U[\tanh(u_i)] = \begin{cases} 1 & \text{for } u_i > 0 \\ 0 & \text{for } u_i < 0 \end{cases} \qquad i = 1, 2, \ldots, M \tag{36}$$

where $U[\cdot]$ is the unit step function. The form $U[\tanh(u_i)]$ is used in (36) to show more
clearly the nonlinear summation input to the output layer, as well as the evolution of the
circuit in Fig. 15 from that of Fig. 4. Different targets are represented by different output
states. 
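A minimal sketch of the forward pass through the label-producing network, following (34) to (36), is given below. The tiny identity "Fourier" matrix and the hand-picked weights are placeholders chosen so the result can be followed by eye; they are assumptions, not values from this work, and `label_output` is a hypothetical name.

```python
import numpy as np

def label_output(F_m, W, R):
    """Binary output state of the network in (34)-(36)."""
    z = np.real(W @ F_m)                  # (34): fixed transform, real part
    u = R @ z                             # (35): weighted sum into each output neuron
    # (36): unit step of tanh(u_i). The text leaves u_i = 0 unspecified;
    # this sketch maps it to 0.
    return (np.tanh(u) > 0).astype(int)

# Placeholder example: 3 input/hidden neurons, 2 output neurons.
W = np.eye(3)                             # stand-in for the Fourier weights
F_m = np.array([1.0 + 0.0j, -2.0 + 1.0j, 0.5 + 0.0j])
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
state = label_output(F_m, W, R)
print(state)
```

Since tanh preserves sign, the step of tanh(u_i) equals the step of u_i; the tanh is kept here only to mirror the form of (36).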

Two  groups  of  test  targets  were  used  in  our  study:  the  first  group  contains  a  100  :  1 
scale  model  of  a  B-52  aircraft  and  a  150  :  1  scale  model  of  a  Boeing  747  airplane;  the  second 
group  contains  a  75  :  1  scale  model  of  a  space  shuttle  in  addition  to  the  two  scale  models  in 
the  first  group.  Sketches  of  all  three  scale  models  with  their  actual  dimensions  are  shown 
in  Fig.  16.  It  can  be  noted  that  the  shapes  of  the  Boeing  747  and  the  space  shuttle  are 
relatively  less  complex  than  that  of  the  B-52  airplane.  Two  output  neurons  are  used  to 
provide  label  representations  for  the  three  aerospace  target  models;  two  output  neurons 
can provide labels for up to 2² (= 4) distinct patterns. The state (0,0) of the output
neurons  in  the  network  shown  in  Fig.  15  is  left  idle  to  indicate  the  case  in  which  there  is 
no  information  input  to  the  network. 

For  practical  applications  of  radar  target  identification,  it  would  be  necessary  to  examine 
the performance of the network for all possible aspects of the target that could be encountered
by the observer (the radar system), a process that entails massive data collection and
storage.  Because  of  the  limitations  of  our  experimental  facility,  frequency  response  data  for 
the  targets  are  collected  for  only  a  limited  range  of  aspect  angles  extending  over  a  range  of 
20°  in  azimuth  from  a  head-on  (0°)  view  of  the  targets  to  20°  towards  the  broad-side  view  of 
the  targets.  The  elevation  angle  of  the  target  was  fixed  at  15°  relative  to  the  horizontal.  The 
results obtained with this limited data set are, however, quite telling and representative of
what  can  be  expected  with  larger  libraries  of  frequency  responses  covering  all  target  aspects 
of  interest.  Frequency  domain  data  are  collected  for  100  aspect  views  equally  spaced  over 
the  20°  range  for  each  target,  representing  a  separation  of  0.2°  between  adjacent  views. 

The network in Fig. 15, designed for target identification, was first presented with
frequency response data from a certain percentage of the 100 aspect views of the targets
to allow learning to take place. Each target is assigned a label: (0,1) for the B-52; (1,0)
for the Boeing 747; and (1,1) for the space shuttle. A total of 101 frequency points were
collected over the band 6.5-17.5 GHz for each aspect view; the number of neurons chosen
in the input and hidden layers was also 101. Learning is carried out via the error back-propagation
algorithm described in Section 5, which enabled adjusting the connection
weight $r_{ij}$ between the output neuron and the hidden neuron. When the frequency response
of a target for a specific aspect angle is presented to the network, the network iteratively
adjusts the weight $r_{ij}$ by error back-propagation until the desired label for the target is
produced by the network. The training data (frequency response for different aspects or
views) are presented in turn to the network for each target; all targets of interest are learned
by the network in turn. The process of presenting all the training data for all targets once
constitutes one learning cycle. The maximum number of iterations required for the network
to learn specific targets of the types used in our study was 7 at the start of the learning
process, but this number decreased as learning progressed or as the number of learning
cycles  increased.  Once  the  network  has  assimilated  and  learned  the  correct  representations 
for  all  targets,  the  learning  process  is  terminated.  The  maximum  number  of  learning  cycles 
observed  for  the  network  to  learn  all  targets  was  8. 
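
The training procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the hidden layer is taken to be the real part of the inverse DFT of the input (standing in for the fixed Fourier weights), only the hidden-to-output weights are learned, an output bit is 1 when tanh(u) > 0, and the update is a plain delta rule. The frequency responses are synthetic random vectors, and the names `hidden`, `predict`, and `train` are hypothetical.

```python
import numpy as np

N_FREQ = 101  # frequency points per aspect view, as in the text
LABELS = {"B-52": (0, 1), "Boeing 747": (1, 0), "space shuttle": (1, 1)}

def hidden(freq_response):
    """Fixed first layer (assumed): real part of the inverse DFT."""
    return np.fft.ifft(freq_response).real

def predict(r, freq_response):
    """Output bits: 1 where tanh(u) > 0, else 0 (zero threshold assumed)."""
    u = r @ hidden(freq_response)
    return tuple(int(b) for b in (np.tanh(u) > 0))

def train(views, labels, max_cycles=50, rate=0.5):
    """Delta-rule training of the hidden-to-output weights r only.

    One learning cycle is one pass over all training views of all targets.
    Returns the weights and the number of cycles used.
    """
    r = np.zeros((2, N_FREQ))
    for cycle in range(1, max_cycles + 1):
        for x, label in zip(views, labels):
            h = hidden(x)
            target = 2.0 * np.array(label) - 1.0   # bits {0,1} -> {-1,+1}
            out = np.tanh(r @ h)
            # gradient of 0.5 * (out - target)^2 with respect to r
            r -= rate * np.outer((out - target) * (1.0 - out**2), h)
        if all(predict(r, x) == lab for x, lab in zip(views, labels)):
            return r, cycle
    return r, max_cycles

# Synthetic stand-ins for measured frequency responses (an assumption).
rng = np.random.default_rng(0)
views = [rng.normal(size=N_FREQ) + 1j * rng.normal(size=N_FREQ) for _ in LABELS]
labels = list(LABELS.values())
r, cycles = train(views, labels)
```

With one synthetic view per target the loop stops as soon as every view yields its label, which mirrors the stopping rule described in the text; real data with many views per target would need more cycles.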

Fig. 17 shows the performance of the network for the first group of targets, the B-52
and the Boeing 747 scale models. The curves in Fig. 17 indicate the probability of
correct  recognition  by  the  network  of  the  two  targets  with  respect  to  the  percentage  of 
the  total  100  aspect  views  collected  that  were  used  for  training.  The  training  set  can  be 
selected  deterministically,  i.e.,  in  a  given  order,  or  randomly  from  the  set  of  100  aspect  views 
characterizing  each  target.  The  criterion  for  choosing  the  training  set  is  to  make  sure  that 
information  about  the  target  is  evenly  represented.  For  example,  the  deterministic  selection 
case  of  50  percent  of  the  available  aspect  views  as  the  training  set  can  be  formed  by  selecting 
every  other  aspect  view,  i.e.,  all  the  even-  (or  odd-)  numbered  views  out  of  the  total  100 
available  aspect  views.  For  the  random  selection  case,  the  training  set  can  be  formed  by 
selecting  aspect  views  out  of  the  total  angular  window  of  20°  with  even  probability.  Our 
study  shows  that  the  performance  of  the  network  is  virtually  unaffected  by  the  method  of 
selection  for  the  training  set  and  at  most  a  1%  discrepancy  in  results  for  the  two  methods 
of  selection  is  observed.  In  order  to  test  the  performance  of  the  network  after  it  has  been 
trained,  all  100  aspect  views  are  used.  While  a  certain  percentage  of  the  test  set  would  have 
been  used  in  training  the  network,  the  remainder  of  the  test  set  would  not  have  been  seen 
by the network before. When 10% of the total available views (equivalently, views
with roughly 2° angular separation) are used for training, the network achieves only 54%
correct recognition for the B-52 and 72% for the Boeing 747, even though the incremental
spacing between viewing angles for the given set of data is small (0.2°). The performance of
the network improves nonlinearly as the percentage of views used for training is increased.
Because the shape of the Boeing 747 is less complex than that of the B-52, the network is
able to capture its underlying structure in its internal representation (the r_ij weights) much
faster, allowing for better recognition. The network reaches 90% correct recognition when
the  percentage  of  views  used  for  training  increases  to  40%  for  the  B-52  and  20%  for  the 
Boeing  747,  or  when  the  minimum  angular  spacing  between  adjacent  views  in  the  training 
set  is  approximately  0.5°  for  the  B-52  and  1°  for  the  Boeing  747.  When  the  percentage  of 
views  used  for  training  for  both  targets  increases  to  60%,  the  network  can  recognize  more 
than  98%  of  the  testing  aspect  views  presented  to  it  correctly. 
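
The two ways of choosing a training set, and the link between the fraction of views used and their angular spacing, can be illustrated as follows. This is a sketch only: the function names are hypothetical, and the regular-stride rule is one possible reading of "deterministic" selection.

```python
import numpy as np

N_VIEWS = 100        # aspect views per target
VIEW_STEP_DEG = 0.2  # angular separation between adjacent views

def deterministic_subset(fraction):
    """Pick views at a regular stride, e.g. every other view for 50%."""
    count = round(fraction * N_VIEWS)
    stride = N_VIEWS / count
    return np.floor(np.arange(count) * stride).astype(int)

def random_subset(fraction, rng):
    """Pick the same number of views uniformly at random, without repeats."""
    count = round(fraction * N_VIEWS)
    return np.sort(rng.choice(N_VIEWS, size=count, replace=False))

def mean_spacing_deg(fraction):
    """Average angular spacing between the selected training views."""
    return VIEW_STEP_DEG / fraction

even_views = deterministic_subset(0.5)                      # views 0, 2, 4, ...
random_views = random_subset(0.4, np.random.default_rng(1))
```

Under this rule, 10% of the views corresponds to a 2° spacing, 20% to 1°, and 40% to 0.5°, matching the spacings quoted in the text.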

For  the  network  shown  in  Fig.  15,  with  the  connection  weights  from  the  input  layer 
neurons  to  the  hidden  layer  neurons  fixed  as  Fourier  weights,  the  input  to  the  hidden  layer 
can  be  interpreted  as  the  real  part  of  the  Fourier  inverse  of  the  measured  frequency  response 
data  Fm  for  one  aspect  view.  This  input  (range-profile)  to  the  hidden  layer  bears  information 
such  as  the  rough  extent,  shape,  fine  structure,  etc.,  of  the  target  as  seen  from  that  aspect 
angle [3]. During training, the network extracts common features or certain correlations
from the training data to form a representation for the target by adjusting its weights r_ij.
When the network is tested, the test views that have not been presented to the network
during training can be considered noisy versions or “correlates” of the training set. This
ability of the net to generalize, i.e., to recognize noisy or correlated data, is an attractive
feature of neuromorphic signal processing. The range-profile data of a complex-shaped
aerospace target can, however, change markedly from one aspect angle to another. For some
aspect angles the resemblance or correlation of adjacent views is so weak, even for the
angular spacing of 0.2° used in our data acquisition, that the network fails to recognize the
targets perfectly (with a 100% score) even when almost all the views are used for training;
this is evident in Fig. 17 from the fact that correct recognition for both targets did not reach
100% until 100% of the available aspect view data were used for training. The results plotted
in Fig. 17 show that the average probability of misrecognition from a single-aspect view when
60% or more views have been used for training is 1%.

Perfect  Recognitions:  The  probability  of  misrecognition  can  be  made  negligible  and 
even reduced to zero in two ways. One way which we describe here is to use more than
one aspect view for a given target in interrogating the network, with the outcome decided
by  a  majority-decision  rule.  The  multi-aspect  views  for  recognizing  aerospace  targets  in  a 
practical  target  identification  system  could  be  readily  collected  and  presented  to  the  network 
as  targets  fly  by  the  system.  The  training  procedure  for  recognition  from  multi-aspect  views 
remains  the  same  as  that  used  for  recognition  from  a  single-aspect  view. 

Fig.  18  shows  the  performance  of  the  same  network  of  Fig.  15  in  recognizing  the 
first  group  of  targets  from  three,  rather  than  one,  aspect  views  after  the  network  has  been 
trained  with  the  available  training  set  of  aspect  views.  The  three  aspect  views  are  randomly 
selected  from  the  test  set  (100  views)  and  are  sequentially  fed  into  the  network;  the  outputs 
from  the  network  provide  the  three  labels  from  which  a  majority  vote  on  the  recognition 
outcome can be determined. There were 33 groups of three aspect views randomly formed
from the total 100 aspect views, thus ensuring that almost every aspect view is included
in the test. Fig. 18, which displays the correct recognition percentages with respect to
these 33 groups, shows that the overall performance of the network improves by about
10% when using three views rather than a single view for interrogation. The correct
recognition performance also increases much faster as the percentage of the views used
for training increases. The network now reaches 100% correct recognition when 25% of
the views for the Boeing 747 and 35% of the views for the B-52 are used for training. The
network was also tested with the second group of targets, which was formed, as mentioned
earlier, by adding a space shuttle scale model to the first group of targets. The network
was trained similarly using a certain percentage of the total available aspect views from
all three targets. Fig. 19 shows that the correct recognition performance of the network for
the space shuttle is similar to that for the Boeing 747. From a practical standpoint, it
makes more sense to evaluate the performance of the net by using multiple aspect views
as test signals combined with a majority vote when the three aspect views are successive
or adjacent to each other rather than distributed over a wide range of aspect angles.
This is representative of situations where the net is probed with three successive frequency
responses collected from a target as the target changes its aspect relative to the measurement
system because of relative motion. In our study, the performance of the network when the
three aspect views are successive or adjacent to each other was found to be similar to the
cases shown in Fig. 19, in which the three aspect views are randomly selected. Recognition
using multi-aspect views may be supported by biological vision systems, in which multiple
perception fields are formed [19].
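
The two-out-of-three rule and the grouping of 100 views into 33 triples can be sketched as below. The helper names are hypothetical. The closing formula is an idealization that assumes the three looks err independently, which adjacent aspect views need not do, so it indicates the trend rather than predicting the measured curves.

```python
import random
from collections import Counter

def majority_vote(labels):
    """Return the label given by at least two of the three looks, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= 2 else None

def random_triples(n_views=100, seed=0):
    """Shuffle the view indices and cut them into disjoint groups of three."""
    order = list(range(n_views))
    random.Random(seed).shuffle(order)
    usable = n_views - n_views % 3        # 99 of the 100 views are used
    return [order[i:i + 3] for i in range(0, usable, 3)]

def two_of_three_accuracy(p):
    """P(at least 2 of 3 looks correct) for independent looks of accuracy p."""
    return p**3 + 3 * p**2 * (1 - p)

groups = random_triples()
decision = majority_vote([(0, 1), (1, 0), (0, 1)])   # two looks say B-52
```

For a single-look accuracy of 0.99, the independent-look formula gives about 0.9997, which illustrates why a majority vote can push the misrecognition rate toward zero.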


The  second  approach  for  reducing  the  misrecognition  probability,  which  we  only  mention 
here, is to use multisensory information for both training and interrogation.
Polarization-sensitive sensors can, for example, be used to measure the frequency response
of the target for orthogonal polarizations. Data generated in this fashion can be used for both training
and  interrogating  the  network  to  enhance  the  probability  of  correct  classification. 

Dynamic  Range  and  Noise  Considerations:  One  issue  that  should  be  mentioned 
with  respect  to  neural  networks  concerns  the  dynamic  range  of  input  signals  to  the  network. 
In  applying  neural  networks  to  practical  problems,  it  is  usual  to  use  binary  digital  inputs 
[7]  or  normalized  inputs  [21].  The  range  of  inputs  to  the  network  shown  in  Fig.  15  is  not 
constrained  (i.e.,  it  is  neither  binarized  nor  normalized);  it  is  the  raw  frequency  response  of 
the target measured for a given aspect, corrected for range-phase and measurement system
response [3]. The network can be trained and tested with signals of arbitrary amplitude.
No normalization is needed for preprocessing. For example, this network, which was trained
with a set of aspect views with a maximum amplitude of 0.5 (arbitrary units) for the B-52
airplane, would yield the same result when interrogated with test sets of aspect views
of maximum amplitudes of either 1 or 10^6 (arbitrary units). This practically significant
behavior, which we attribute to the highly nonlinear nature of the network (see equations
(35) and (36)), indicates that there is little constraint on the dynamic range of the test
signals applied to the trained net.
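
One simple way to see how such amplitude insensitivity can arise is sketched below. It is an assumption-laden illustration, not the authors' analysis: the layer ahead of the output is taken to be linear (the real part of an inverse DFT), and each output bit is taken to depend only on the sign of tanh(u). Under those assumptions, multiplying the input by any positive constant cannot change the label. The weights and input are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.normal(size=(2, 101))                          # stand-in output weights
x = rng.normal(size=101) + 1j * rng.normal(size=101)   # stand-in frequency response

def label(freq_response):
    """Label bits from a linear Fourier layer followed by a zero threshold."""
    h = np.fft.ifft(freq_response).real    # linear in the input
    return tuple(int(b) for b in (np.tanh(r @ h) > 0))

# Same view at three very different amplitudes.
labels_by_scale = [label(scale * x) for scale in (0.5, 1.0, 1e6)]
```

All three entries of `labels_by_scale` coincide, because scaling the input scales u by the same positive factor and leaves its sign unchanged.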

A second issue concerns the network's performance with noisy data. Data in our study,
which were collected in an experimental imaging facility, had an SNR of about 15-20 dB.
The network was also tested with signals having a smaller SNR by adding to the test
data artificial Gaussian noise in accordance with the distribution shown in (32). This
situation was taken to be a crude representation of cases where the test data are collected
under non-ideal situations, such as when vibrations and wind buffeting against an aircraft
produce noisy frequency response measurements. The training data used were still the
original frequency response data collected in our anechoic chamber measurement facility
with no additional noise added. Fig. 19 shows that the network is able to perform 100%
correct recognition of the three test targets when the network was trained with 40% of the
available aspect views and tested with the test set of experimental data without additional
noise added. During the training process, the output was mapped from the input as shown
in (36). When noise was added to the test set to test the network trained with 40% of the
aspect views of experimental data, the performance of the network was as given in Table
2 by the row beginning with θ = 0 for the Boeing 747 plane. The performance of the
network for the other two target models was found to be generally similar and is therefore
not shown. It is seen from Table 2 that the performance of the network deteriorates as SNR
decreases, but the network is still able to furnish 74% correct recognition even with SNR=1
(i.e., SNR=0 dB). The performance of the network in the presence of this severe noise case
can be improved by changing the zero threshold in (36) to a finite threshold during the
training process, and by maintaining the zero threshold during the test or interrogation
stage. In this case, the output neuron state in (36) during the training process was replaced
by

$$
\begin{cases}
1 & \text{for } \tanh(u_i) > \theta \\
0 & \text{for } \tanh(u_i) < -\theta
\end{cases}
\tag{37}
$$

where θ represents the threshold. The output neuron state during the test process is still
given by (36) or by θ = 0 in (37). The performance of the network in recognizing the
Boeing 747 scale model for θ = 0.1 in (37) is shown in the last row in Table 2; the network
was trained with 40% of the available aspect views with no additional noise added. The
improvement in performance resulting from the finite threshold can be readily noted; in
the low SNR range an improvement of roughly 20 percentage points is achieved. As the
threshold θ increases, the performance of the network with respect to noisy data improves.
But in situations where the noise is severe, such as SNR=1, it is hard to achieve perfect
recognitions, since thresholding becomes less effective.

Table 2: Percent correct recognition of Boeing 747 for two different values of the threshold θ.
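
The noise test and the training-time threshold of (37) can be sketched as follows. The noise model here is an assumption: zero-mean complex Gaussian noise whose power is set from the requested SNR (signal power divided by noise power), which may differ in detail from the distribution in (32). The names `add_noise`, `snr_db`, and `learned` are hypothetical.

```python
import numpy as np

def add_noise(signal, snr, rng):
    """Add complex Gaussian noise so that signal power / noise power = snr."""
    power = np.mean(np.abs(signal) ** 2)
    sigma = np.sqrt(power / snr / 2.0)     # per real and imaginary component
    noise = (rng.normal(scale=sigma, size=signal.shape)
             + 1j * rng.normal(scale=sigma, size=signal.shape))
    return signal + noise

def snr_db(snr):
    """Convert a power ratio to decibels (SNR = 1 is 0 dB)."""
    return 10.0 * np.log10(snr)

def learned(u, bit, theta=0.0):
    """Training criterion in the spirit of (37): the bit counts as learned
    only if tanh(u) clears the margin theta on the correct side."""
    y = np.tanh(u)
    return bool(y > theta) if bit == 1 else bool(y < -theta)

rng = np.random.default_rng(0)
clean = np.ones(200_000, dtype=complex)
noisy = add_noise(clean, snr=1.0, rng=rng)
measured_snr = np.mean(np.abs(clean) ** 2) / np.mean(np.abs(noisy - clean) ** 2)
```

An output with tanh(u) of about 0.05 passes the zero-threshold test but fails the θ = 0.1 test, so training with a finite θ keeps adjusting the weights until the outputs sit further from the decision boundary, leaving more room for noise at test time.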

Effect of Spectral Windows: All results presented above are for frequency response
data collected at 101 points over the 6.5-17.5 GHz band. A question of practical importance
is whether fewer data points or a narrower spectral window can be used to facilitate the
data acquisition process without sacrificing target identification ability by the trained net.
We used several approaches to assess the effects of spectral bandwidth and the number of
data points over the band on the performance of the network in identifying the given target
models. One way was to keep the spectral band fixed at 6.5-17.5 GHz and decrease the
number  of  data  points  over  the  band;  this  is  equivalent  to  changing  the  sampling  interval  of 
the  frequency  response  data.  In  so  doing,  the  number  of  neurons  in  the  input  layer,  which 
represents  the  number  of  data  points  in  each  measured  frequency  response,  is  decreased 
along  with  the  number  of  hidden  neurons  which  is  equal  to  the  number  of  neurons  in  the 
input  layer.  Another  approach  was  to  keep  the  sampling  interval  unchanged  and  to  choose 
a  portion  of  the  6.5-17.5  GHz  band  as  the  new  spectral  band,  which  again  decreases  the 
number of neurons in the input layer. In this case, the location of the selected spectral band
was found to have little effect on the performance of the network. In all of the above cases,
the following behaviors were observed: (a) When the number of data points, and hence the
number of neurons in the input layer, is decreased, either
by changing the sampling interval or by choosing a smaller spectral band, the number of
learning cycles required by the net increases; this may be explained by the fact that, for
every target, the amount of information in the data sets presented to the net during training
is reduced as the number of input data points is decreased; thus, it takes relatively longer
for the net to learn the underlying structure in the data presented to it and to form internal
representations of the targets. (b) When the number of input data points to the net is
too small, the net cannot learn or form the internal representations, and the learning process
does not converge. With the band fixed at 6.5-17.5 GHz, the learning process was found to
diverge when the number of data points was reduced to 17, the integer closest to 101/6,
where 6 is the factor by which the sampling interval of the frequency data was increased.
(c) When the number of
input data points to the net is decreased, the performance of the net generally deteriorates;
the average deterioration is 5%, with no clear pattern. For
example, when the frequency band was reduced to 10.5-15.9 GHz, over which there were
50 data points, and 40% of the available 100 aspect views (frequency responses) over this
band were used for training the net, the net's performance in recognizing the Boeing 747 was
94%. This can be compared with the results shown in Fig. 19, in which the net was able
to achieve 100% correct identification of the Boeing 747 when it was trained with 40% of
the available views of 101 data points over the 6.5-17.5 GHz band and tested with aspect
views over this frequency band. The performance of the net with narrow spectral band data
can be improved by increasing the percentage of available aspect views used for training.
When the input frequency data to the first layer of the net consisted of 50 points over the
10.5-15.9 GHz band, and the percentage of the available aspect views used for training the
net was increased to 50%, the performance of the net in identifying the Boeing 747 model
was found to improve to 99%.
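
The first approach, coarser sampling over a fixed band, can be illustrated with a short sketch. It assumes the 101 points are uniformly spaced across 6.5-17.5 GHz and that decimation keeps every sixth sample starting from the first; the text does not say which samples were retained, so `decimate` is a hypothetical helper.

```python
import numpy as np

freqs_ghz = np.linspace(6.5, 17.5, 101)   # assumed uniform grid, 0.11 GHz apart

def decimate(samples, factor):
    """Keep every `factor`-th sample, enlarging the sampling interval."""
    return samples[::factor]

coarse = decimate(freqs_ghz, 6)
n_points = len(coarse)                    # 17 points, as quoted in the text
interval_ghz = coarse[1] - coarse[0]      # 6 x 0.11 GHz
```

The same slicing would be applied to each measured frequency response, so the input layer shrinks from 101 to 17 neurons.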

The  divergence  mentioned  in  the  preceding  observation  (b)  occurs  when  the  number 
of  input  data  points  to  the  net,  and  hence  the  number  of  input  layer  neurons  (and  thus 
the  number  of  hidden  neurons,  which  equals  the  number  of  neurons  in  the  input  layer),  is 
too  small.  Theoretical  considerations  of  the  mapping  power  of  multi-layer  networks  [8], [15] 
suggest  that  any  mapping  can  be  accomplished  through  a  network  of  the  type  shown  in  Fig. 
15  provided  that  an  adequate  number  of  hidden  neurons  is  used  (see  cautionary  arguments 
noted  in  epilogue,  [13]).  We  therefore  tested  whether  the  network  can  converge  and  learn  to 
form  internal  representations  of  the  targets  when  the  number  of  input  data  points  was  small 
by  increasing  the  number  of  hidden  neurons  in  the  net.  As  mentioned  earlier,  when  the 
number  of  input  (frequency  response  data)  points  over  the  6.5-17.5  GHz  band  to  the  net  is 
reduced  to  17,  the  learning  process  by  the  net  could  not  converge;  in  this  case,  the  number 
of  the  hidden  neurons  was  also  17.  However,  by  increasing  the  number  of  hidden  neurons  to 
21,  the  net  is  able  to  converge  and  learn  the  internal  representations  for  the  given  aerospace 
target  models.  It  should  be  pointed  out  that,  since  the  Fourier  transform  mapping  between 
the  hidden  and  input  layers  in  the  net  of  Fig.  15  is  carried  out  according  to  the  discrete 
summation  given  in  (34),  the  number  of  hidden  neurons  does  not  have  to  be  equal  to  the 
number  of  input  layer  neurons  (see  also  equation  (7));  this  result  supports  the  theory  in 
[8], [15].  By  increasing  the  number  of  hidden  neurons  further,  the  number  of  learning  cycles 
required  by  the  net  to  converge  during  the  training  process  is  reduced.  Once  there  are 
enough  hidden  neurons  and  the  net  is  able  to  converge  to  learn  the  internal  representations 
for  the  given  aerospace  target  models,  no  clear  improvement  in  performance  is  found,  in 
terms  of  correctly  identifying  the  given  target  models  when  the  number  of  hidden  neurons 
is  increased  further  [21]. 
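
A fixed Fourier layer whose hidden size differs from its input size can be written as a rectangular matrix. The sketch below is a generic inverse-DFT sum evaluated at `n_hidden` range bins; equation (34) is not reproduced in this section, so the exact normalization and sign convention used here are assumptions, and the function names are hypothetical.

```python
import numpy as np

def fourier_layer(n_input, n_hidden):
    """Fixed weights W[m, n] = exp(2j*pi*m*n / n_hidden) / n_input."""
    m = np.arange(n_hidden)[:, None]
    n = np.arange(n_input)[None, :]
    return np.exp(2j * np.pi * m * n / n_hidden) / n_input

def hidden_input(freq_response, n_hidden):
    """Real part of the discrete Fourier sum, one value per hidden neuron."""
    W = fourier_layer(len(freq_response), n_hidden)
    return (W @ freq_response).real

rng = np.random.default_rng(0)
F = rng.normal(size=17) + 1j * rng.normal(size=17)   # 17 stand-in frequency samples
profile_21 = hidden_input(F, 21)                     # 21 hidden neurons
profile_17 = hidden_input(F, 17)                     # square case
```

In the square case the result coincides with the real part of `np.fft.ifft(F)`; with 21 hidden neurons the same 17 samples are simply evaluated on a finer range grid.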

8  Classification,  Identification  and  Cognition 

The terms “target identification” and “target recognition” are frequently used interchangeably
in the literature, and we have done the same here. Actually there is an important
difference between the two terms. The network we have described in the preceding section
is not cognitive. Once it has learned a set of targets, it can correctly identify which out of
the set is responsible for the sensory signal (e.g., the complex frequency response) presented
at  its  input  by  producing  a  correct  identification  label  at  its  output.  The  net  is  robust,  in 
that  noisy  versions  of  its  training  set  data  are  also  correctly  classified  by  triggering  the 
correct  identification  label.  This  robustness  also  provides  for  a  generalization  capability,  in 
that  the  network  is  able  to  classify  correctly  a  data  set  belonging  to  the  learned  object  that 
was  not  specifically  among  the  training  set.  This  capability  to  generalize  means  that  the  net 
does  not  have  to  be  trained  on  all  data  sets  needed  to  represent  the  object  as  dictated  by 
angular sampling considerations (e.g., the scattering pattern of a target of extent L must be
sampled approximately every λ/L [radians], where λ is the mean wavelength of observation).
Without  proper  precautions,  these  robustness  and  generalization  features  also  mean  that 
every  input  presented  to  the  network  will  produce  a  response  by  triggering  a  label,  even 
when  the  input  belongs  to  a  novel  object,  i.e.,  one  that  was  not  learned  by  the  network. 
The  network  is  therefore  not  cognitive  in  that  it  has  no  mechanism  for  determining  whether 
a  presented  signal  belongs  to  a  familiar  (previously  learned)  object  or  to  a  novel  object. 
Cognitive  capability  is  essential  for  proper  interpretation  and  use  of  a  classifier  network’s 
response,  as  well  as  for  possible  triggering  of  other  useful  mechanisms  like  learning  a  novel 
input  and  adding  it  to  the  repertoire  of  the  net. 
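
As a worked illustration of the λ/L angular sampling rule mentioned above, the sketch below evaluates it at the centre of the 6.5-17.5 GHz band. Taking the mean wavelength as the wavelength at the band centre is an approximation, and the 0.75 m target extent is a hypothetical value chosen only for the example; the text does not give the model dimensions.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def angular_sampling_deg(mean_freq_hz, extent_m):
    """Approximate angular sampling interval lambda / L, in degrees."""
    wavelength = C / mean_freq_hz
    return math.degrees(wavelength / extent_m)

# 12 GHz is the centre of the 6.5-17.5 GHz band; 0.75 m is hypothetical.
step_deg = angular_sampling_deg(12e9, 0.75)
```

For these assumed values the rule gives a step of roughly 1.9°.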

There  are  several  ways  to  impart  cognition  to  a  classification  network.  One  is  to  train 
the  network  on  every  object  it  could  possibly  encounter  in  its  environment  in  the  course  of 
normal  operation.  This  approach  may  not,  however,  be  practical,  as  it  could  require  a  major 
increase  in  the  size  of  the  network,  especially  when  the  number  of  possible  targets  is  very 
large.  A  second  way  to  impart  cognition  is  to  add  at  the  system  sensory  level  detectors  that 
analyze  the  received  signals  to  see  whether  they  belong  to  the  class  of  targets  of  interest. 
Usually,  inference  rules  and  decision  trees  are  needed  to  make  such  distinctions,  and  more 
than  one  sensing  modality  is  often  indicated  (e.g.,  measurement  of  altitude,  speed,  bearing, 
size  (radar  cross  section),  polarization,  etc.).  A  third  way  for  making  a  network  cognitive 
is  to  incorporate  cognitive  capabilities  in  designing  the  net  from  the  outset  [22]. 

9  Discussion 

Extrapolation and reconstruction by neural networks through learning were discussed in
the first part of this paper. This approach provides a novel way for near-perfect extrapolation
and reconstruction from partial frequency response information. The approach leads
by logical extension to the problem of target identification using label representations at
the output layer in place of the exact object functions reconstructed in the extrapolation
problem. The focus in using neural networks for extrapolation and recognition is on the
structure of networks and on the learning that takes place in them, and not on any particular
computation carried out by a particular neuron. The number of neurons in the hidden
layer of such networks need not be equal to that in the input layer, as in most of the nets
presented here, and can be increased at will. The synaptic connections from the input layer
to both the hidden and the output layers need not be fixed, as was the case in this study, but
can learn to handle any reconstruction problem in which the available data and the object
functions do not necessarily have a Fourier transform relation or when the relation is not
certain or known. In our work, the measured frequency response data and the object function
(the real part of the Fourier inverse of the frequency response, i.e., the real part of the
complex range profile of the target) form a Fourier transform pair. For practical application
of the target identification concept presented in this paper, one envisions that a library of
frequency responses of scale models of targets of interest can be generated by measurements
under controlled conditions in an anechoic chamber radar scattering measurement facility
for all target aspects relevant to practical encounter scenarios between a radar system and
the target. Data generated in this fashion would be “taught” to a layered net by training
as we have described. To use such “trained nets” to identify actual radar targets (that
correspond to the scale models used) from data generated by broad-band radar systems in the
field, attention to scaling issues would be given by invoking the principle of electromagnetic
similitude [20]. In this fashion, one hopes to avoid the tedious and costly task of forming
libraries in the field using actual radar systems and cooperative target “fly-bys”.
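
The frequency scaling implied by electromagnetic similitude can be illustrated with a one-line mapping. The sketch assumes ideal geometric scaling of a highly conducting target, for which a model smaller by a factor s measured at frequency f represents the full-size target at f/s. The 1:100 scale used below is hypothetical; the text does not state the scale of the models.

```python
def full_scale_band(model_band_ghz, scale):
    """Map a model measurement band to the equivalent full-scale band.

    `scale` is the full-scale length divided by the model length.
    """
    low, high = model_band_ghz
    return low / scale, high / scale

# Hypothetical 1:100 model measured over 6.5-17.5 GHz.
band_ghz = full_scale_band((6.5, 17.5), 100)   # 0.065-0.175 GHz full scale
```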

The number of neurons in the input layer of our learning networks is determined by
the number of available frequency samples. The relation between the number of functions
that can be learned by the network and the number of neurons in the hidden layer is still
an open question; however, the theoretically established claim for the mapping power of
multi-layer neuron networks [8], [15], taken together with the findings of this work, provides
strong evidence in support of the use of layered networks for target recognition. Nonlinear
mappings in layered networks enable the formation of the desired reconstruction mapping
region [15] to give robust reconstructions from partial and noisy frequency information. The
application of these concepts to the problem of noncooperative radar target identification
provides convincing evidence of the capability of neuromorphic processing in providing
results not attainable by traditional signal processing techniques.


Figure 1: Microwave images reconstructed by DFT (a) for spectral bandwidth 6-17 GHz
and (b) for spectral bandwidth 2-26.5 GHz; (c) image reconstructed by nonlinear neural
net for the 6-17 GHz spectral bandwidth data.

Figure 3: Network for XOR mapping.

Figure 5: Two test object patterns o(r) used in simulations: (a) first pattern; (b) second
pattern. (Axis label: amplitude.)

Figure 6: Frequency responses for the first object (solid line) and the second object (dotted
line): (a) real part; (b) imaginary part.

Figure 7: Reconstruction of the first object pattern by DFT: (a) real part; (b) intensity.

Figure 8: Sequence shows how the network gradually learns the synaptic connections to
provide eventually the correct output for two patterns from partial associated frequency
responses presented at its input. (Panels show the output for the 1st pattern and the output
for the 2nd pattern when the 1st pattern is used for training the net in the 1st learning
cycle and in the 3rd learning cycle.)

Figure 9: Object patterns with more complex shapes used in simulation: (a) first pattern;
(b) second pattern. (Axis label: range in centimeters.)

Figure 10: Reconstructions of the complex-shaped patterns of Fig. 9: (a) first pattern; (b)
second pattern.

Figure 12: Reconstructions from noise-contaminated frequency responses of Fig. 11: (a)
first pattern; (b) second pattern.

Figure 13: Noise-contaminated frequency responses (SNR = 1) of the first pattern (solid line)
and the second pattern (dotted line) of Fig. 5: (a) real part; (b) imaginary part.

Figure 14: Reconstruction from the noisy data (SNR = 5) of Fig. 11 after the network has
been trained with instances of the noisy data (SNR = 1) of Fig. 13 and the noise free data
of Fig. 6: (a) first pattern; (b) second pattern.

Figure 15: Neural network for target recognition.

Figure 17: Correct recognition from single echo or “look” vs. size of training set for the
B-52 (solid line) and for the Boeing 747 (dashed line). (Horizontal axis: percentage of
available aspect views of each target included in learning set.)

Figure 18: Correct recognitions vs. the size of training set when a “two out of three”
majority vote criterion is used for correct classification, for the B-52 (solid line) and for the
Boeing 747 (dashed line). (Horizontal axis: percentage of available aspect views of each
target included in learning set.)

Figure 19: Correct recognitions vs. the size of training set when a “two out of three”
majority vote criterion is used for correct classification, for the B-52 (solid line), Boeing 747
(dashed line), and space shuttle (dotted line). (Horizontal axis: percentage of available
aspect views of each target included in learning set.)
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do  no!  belong  to  the  stored  object  information.  Simulations  were  also  rained  out 
to  verify  that  several  distinct  terminal  attractors  with  unique  basins  of  attraction 
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Optoelectronic  Neural  Networks  and 
Learning  Machines 

Nabil  H.  Farhat 


Foreword 

Circuits and Devices Magazine is featuring three sequential articles on the current status
of artificial neural network implementation technology. The current offering, on optronic
implementation of artificial neural networks, is the second entry in this trilogy. It is
sandwiched between the previous overview on analog implementation and the upcoming
survey of digital artificial neural networks.

Nabil H. Farhat, who penned this overview, is a co-author of the 1985 article in Optics
Letters and follow-up paper in Applied Optics that broke ground for modern optical
implementation of artificial neural networks.

Robert J. Marks II


Abstract 

Optics offers advantages in realizing the parallelism, massive interconnectivity, and
plasticity required in the design and construction of large-scale optoelectronic (photonic)
neurocomputers that solve optimization problems at potentially very high speeds by
learning to perform mappings and associations. To elucidate these advantages, a brief
neural net primer based on phase-space and energy landscape considerations is first
presented. This provides the basis for subsequent discussion of optoelectronic
architectures and implementations with self-organization and learning ability that are
configured around an optical crossbar interconnect. Stochastic learning in the context of a
Boltzmann machine is then described to illustrate the flexibility of optoelectronics in
performing tasks that may be difficult for electronics alone. Stochastic nets are studied to
gain insight into the possible role of noise in biological neural nets. We close by
describing two approaches to realizing large-scale optoelectronic neurocomputers:
integrated optoelectronic neural chips with interchip optical interconnects that enable
their clustering into large neural networks, and nets with two-dimensional rather than
one-dimensional arrangement of neurons and four-dimensional connectivity matrices for
increased packing density and compatibility with two-dimensional data. We foresee
integrated optoelectronics or photonics playing an increasing role in the construction of a
new generation of versatile programmable analog computers that perform computations
collectively for use in neuromorphic (brain-like) processing and fast simulation and study
of complex nonlinear dynamical systems.

Introduction 

Neural net models and their analogs offer a brain-like approach to information processing
and representation that is distributed, nonlinear, and iterative. Therefore they are best
described in terms of phase-space behavior, where one can draw upon a rich background
of theoretical results developed in the field of nonlinear dynamical systems. The ultimate
purpose of biological neural nets (BNNs) is to sustain and enhance survivability of the
organism they reside in, doing so in an imprecise and usually very complex environment
where sensory impressions are at best sketchy and would be difficult to make sense of if
they were treated and analyzed by conventional means. Embedding artificial neural nets
(ANNs) in man-made systems therefore endows them with enhanced survivability through
fault tolerance, robustness, and speed. Furthermore, survivability implies adaptability
through self-organization, knowledge accumulation, and learning. It also implies lethality.

All of these are concepts found at play in a wide range of disciplines such as economics,
social science, and even military science, which can perhaps explain the widespread
interest in neural nets exhibited today from both intellectual and technological
viewpoints. It is widely believed that artificial neurocomputing and knowledge processing
systems could eventually have significant impact on information processing, pattern
recognition, and control. However, to realize the potential advantages of neuromorphic
processing, one must contend with the issue of how to carry out collective neural
computation algorithms at speeds far beyond those possible with digital computing.
Obviously parallelism and concurrency are essential ingredients, and one must contend
with basic implementation issues of how to achieve such massive connectivity and
parallelism and how to achieve artificial plasticity, i.e., adaptive modification of the
strength of interconnections (synaptic weights) between neurons that is needed for
memory and self-programming (self-organization and learning). The answers to these
questions seem to be coming from two directions of research. One is connection machines
in which a large number of digital central processing units are interconnected to perform
parallel computations in VLSI hardware; the other is analog hardware where a large
number of simple processing units (neurons) are connected through modifiable weights
such that their phase-space dynamic behavior has useful signal processing functions
associated with it.

Analog optoelectronic hardware implementation of neural nets (see Farhat et al. in list of
further reading), since first introduced in 1985, has been the focus of attention for several
reasons. Primary among these is that the optoelectronic or photonic approach combines
the best of two worlds: the massive interconnectivity and parallelism of optics and the
flexibility, high gain, and decision-making capability (nonlinearity) offered by electronics.
Ultimately, it seems more attractive to form analog neural hardware by completely optical
means, where switching of signals from optical to electronic carriers and vice versa is
avoided. However, in the absence of suitable fully optical decision-making devices (e.g.,
sensitive optical bistability devices), the capabilities of the optoelectronic approach remain
quite attractive and could in fact remain competitive with other approaches when one
considers the flexibility of architectures possible with it.* In this paper we therefore
concentrate on the optoelectronic approach and give selected examples of possible
architectures, methodologies, and capabilities aimed at providing an appreciation of its
potential in building a new generation of programmable analog computers suitable for
the study of nonlinear dynamical systems and the implementation of mappings,
associative memory, learning, and optimization functions at potentially very high speed.

We begin with a brief neural net primer that emphasizes phase-space description, then
focus attention on the role of optoelectronics in achieving massive interconnectivity and
plasticity. Architectures, methodologies, and suitable technologies for realizing
optoelectronic neural nets based on optical crossbar (matrix vector multiplier)
configurations for associative memory function are then discussed. Next, partitioning an
optoelectronic analog of a neural net into distinct layers with a prescribed
interconnectivity pattern as a prerequisite for self-organization and learning is discussed.
Here the emphasis will be on stochastic learning by simulated annealing in a Boltzmann
machine. Stochastic learning is of interest because of its relevance to the role of noise in
biological neural nets and because it provides an example of a task that demonstrates the
versatility of optics. We close by describing several approaches to realizing the large-scale
networks that would be required in analog solution of practical problems.


Neural  Nets  — A  Brief  Overview 

In this section, a brief qualitative description of neural net properties is given. The
emphasis is on energy landscape and phase-space representations and behavior. The
descriptive approach adopted is judged best as background for appreciating the material
in subsequent sections without having to get involved in elaborate mathematical
exposition. All neural net properties described here are well known and can easily be
found in the literature. The viewpoint of relating all neural net properties to energy
landscape and phase-space behavior is also important and useful in their classification.

A neural net of N neurons has $(N^2 - N)$ interconnections or $(N^2 - N)/2$ symmetric
interconnections, assuming that a neuron does not communicate with itself. The state of a
neuron in the net, i.e., its firing rate, can be taken to be binary (0, 1) (on-off, firing or not
firing) or smoothly varying according to a nonlinear continuous monotonic function often
taken as a sigmoidal function bounded from above


*It is worth mentioning here that recent results obtained in our work show that networks
of logistic neurons, whose response resembles that of the derivative of a sigmoidal
function, exhibit rich and interesting dynamics, including spurious-state-free associative
recall, and allow the use of unipolar synaptic weights. The networks can be realized in a
large number of neurons when implemented with optically addressed reflection-type
liquid crystal spatial light modulators. However, the flexibility of such an approach versus
that of the photonic approach is yet to be determined.

**From here on it will be taken as understood that whenever the subscripts i or j appear,
they run from 1 up to N, where N is the number of neurons in the net.


and below. Thus the state of the i-th neuron in the net can be described mathematically by

$$s_i = f\{u_i\}, \qquad i = 1, 2, 3, \ldots, N\,{}^{**} \tag{1}$$

where $f\{\cdot\}$ is a sigmoidal function and

$$u_i = \sum_{j=1}^{N} W_{ij} s_j - \theta_i + I_i \tag{2}$$

is the activation potential of the i-th neuron, $W_{ij}$ is the strength or weight of the
synaptic interconnection between the j-th neuron and the i-th neuron, and $W_{ii} = 0$
(i.e., neurons do not talk to themselves). $\theta_i$ and $I_i$ are, respectively, the
threshold level and external or control input to the i-th neuron; thus $W_{ij} s_j$
represents the input to neuron i from neuron j, and the first term on the right side of (2)
represents the sum of all such inputs to the i-th neuron. For excitatory interconnections or
synapses, $W_{ij}$ is positive, and it is negative for inhibitory ones. For a binary neural
net, that is, one in which the neurons are binary, i.e., $s_i \in \{0, 1\}$, the smoothly
varying function $f\{\cdot\}$ is replaced by $U\{\cdot\}$, where U is the unit step function.
When W is symmetric, i.e., $W_{ij} = W_{ji}$, one can define (see J. J. Hopfield's article in
list of further reading) a Hamiltonian or energy function E for the net by

$$E = -\frac{1}{2}\sum_{i} u_i s_i
   = -\frac{1}{2}\sum_{i}\sum_{j} W_{ij} s_i s_j + \frac{1}{2}\sum_{i} (\theta_i - I_i)\, s_i \tag{3}$$

The energy is thus determined by the connectivity matrix $W_{ij}$, the threshold level
$\theta_i$, and the external input $I_i$. For symmetric $W_{ij}$ the net is stable; that is,
for any threshold level $\theta_i$ and given strobed (momentarily applied) input $I_i$,
the energy of the net will be a decreasing function of the neurons' state $s_i$ or a
constant. This means that the net always heads to a steady state of local or global energy
minimum. The descent to an energy minimum takes place by the iterative discrete
dynamical process described by Eqs. (1) and (2) regardless of whether the state update of
the neurons is synchronous or asynchronous. The minimum can be local or global, as the
"energy landscape" of a net (a visualization of E for every state $s_i$) is not monotonic
but will possess many uneven hills and troughs and is therefore characterized by many
local minima of various depths and one global (deepest) minimum. The energy landscape
can therefore be modified in accordance with Eq. (3) by changing the interconnection
weights $W_{ij}$ and/or the threshold levels $\theta_i$ and/or the external input $I_i$.
This ability to "sculpt" the energy landscape of the net provides for almost all the rich and
fascinating behavior of neural nets and for the ongoing efforts of harnessing these
properties to perform sophisticated spatio-temporal mappings, computations, and control
functions. Recipes exist that show how to compute the $W_{ij}$ matrix to make the local
energy minima correspond to specific desired states of the network. As the energy minima
are stable states, the net tends to settle in one of them, depending on the initializing state,
when strobed by a given input. For example, a binary net of N = 3 neurons will have a
total of $2^N = 8$ states. These are listed in Table 1. They represent all possible
combinations of $s_1$, $s_2$, and $s_3$ of the three neurons that describe the state vector
$\mathbf{s} = [s_1, s_2, s_3]$ of the net. For a net of N neurons the state vector is
N-dimensional. For N = 3 the state vector can be represented as a point (tip of a position
vector) in 3-D space. The eight state vectors listed in Table 1 then fall on the vertices of a
unit cube as illustrated in Fig. 1(a). As the net changes its state, the tip of the state vector
jumps from vertex to vertex, describing a discrete trajectory as depicted by the broken
trajectory starting from the tip of the initializing state vector $\mathbf{s}_i$ and ending
at the tip of the final state vector $\mathbf{s}_f$. For any symmetric connectivity matrix
assumed for the three-neuron net example, each of the eight states in Table 1 yields a
value of the energy E. A listing of these values for each state represents the energy
landscape of the net.

Fig. 1 Phase-space or state-space representation and trajectories for a neural net of N = 3
neurons: (a) for binary neurons, (b) for neurons with normalized smooth (sigmoidal)
response.
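The three-neuron example can be made concrete with a short numerical sketch. The weight values below are hypothetical and chosen only for illustration (they do not come from the article), the thresholds and external inputs are set to zero, and the function names `energy` and `relax` are likewise illustrative. Under those assumptions the code lists the energy $E = -\frac{1}{2}\sum_i u_i s_i$ for all eight binary states and follows an asynchronous descent from one vertex of the unit cube.

```python
import itertools
import numpy as np

# Hypothetical symmetric weights with zero diagonal (W_ij = W_ji, W_ii = 0).
W = np.array([[0.0, 1.0, -2.0],
              [1.0, 0.0, 1.0],
              [-2.0, 1.0, 0.0]])
theta = np.zeros(3)   # thresholds, assumed zero here
I_ext = np.zeros(3)   # external inputs, assumed zero here


def energy(s):
    """Energy E = -(1/2) * sum_i u_i s_i, with u_i taken from Eq. (2)."""
    s = np.asarray(s, dtype=float)
    u = W @ s - theta + I_ext
    return -0.5 * float(u @ s)


def relax(s):
    """Apply asynchronous updates s_i = U(u_i) until no neuron changes."""
    s = np.array(s, dtype=float)
    changed = True
    while changed:
        changed = False
        for i in range(len(s)):
            u_i = W[i] @ s - theta[i] + I_ext[i]
            new_value = 1.0 if u_i > 0 else 0.0  # unit step, with U(0) = 0
            if new_value != s[i]:
                s[i] = new_value
                changed = True
    return s


# Energy landscape: one energy value per vertex of the unit cube.
landscape = {state: energy(state)
             for state in itertools.product([0, 1], repeat=3)}
for state, value in landscape.items():
    print(state, value)

final_state = tuple(int(x) for x in relax([1, 0, 1]))
print("relaxed state:", final_state, "energy:", energy(final_state))
```

With these particular weights the state (1, 0, 1) has the highest energy (E = 2) and relaxes to (0, 1, 1), one of two equally deep minima with E = -1. Setting the thresholds and inputs to zero keeps the example small; nonzero values simply tilt the same landscape.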

For a nonbinary neural net whose neurons have normalized sigmoidal response
$s_i \in [0, 1]$, i.e., $s_i$ varies smoothly between zero and one, the phase-space
trajectory is continuous and is always contained within the unit cube as illustrated in
Fig. 1(b). The neural net is then governed by a set of continuous differential equations
rather than the discrete update relations of Eqs. (1) and (2). Thus one can talk of nets with
either discrete or continuous dynamics. The above phase-space representation is
extendable to a neural net of N neurons, where one considers discrete trajectories between
the vertices of a unit hypercube in N-dimensional space or a smooth trajectory confined
within the unit hypercube for discrete and continuous neural nets, respectively.
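A continuous-dynamics net can be sketched numerically as well. The article does not write out the differential equations, so the code below assumes the commonly used additive form $du_i/dt = -u_i + \sum_j W_{ij} s_j - \theta_i + I_i$ with $s_i = f\{u_i\}$ a logistic sigmoid, integrated by simple Euler steps. The weights, gain, step size, and initial condition are all hypothetical illustration values.

```python
import numpy as np


def sigmoid(u, gain=4.0):
    """Normalized sigmoidal response bounded between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-gain * u))


def simulate(W, theta, I_ext, u0, dt=0.01, steps=2000):
    """Euler integration of the assumed dynamics du/dt = -u + W s - theta + I."""
    u = np.array(u0, dtype=float)
    trajectory = [sigmoid(u)]
    for _ in range(steps):
        s = sigmoid(u)
        u = u + dt * (-u + W @ s - theta + I_ext)
        trajectory.append(sigmoid(u))
    return np.array(trajectory)  # shape (steps + 1, N): points in phase space


# Hypothetical symmetric weights with zero diagonal.
W = np.array([[0.0, 1.0, -2.0],
              [1.0, 0.0, 1.0],
              [-2.0, 1.0, 0.0]])
trajectory = simulate(W, np.zeros(3), np.zeros(3), u0=[0.5, -0.2, 0.3])
print("start:", trajectory[0], "end:", trajectory[-1])
```

Because every state component is the output of the sigmoid, each point of `trajectory` lies inside the unit cube, which is the property Fig. 1(b) illustrates.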

The stable states of the net, described before as minima of the energy landscape,
correspond to points in the phase-space towards which the state of the net tends to evolve
in time when the net is iterated from an arbitrary initial state. Such stable points are called
"attractors" or "limit points" of the net, to borrow from terms used in the description of
nonlinear dynamical systems. Attractors in phase-space are characterized by basins of
attraction of given size and shape. Initializing the net from a state falling within the basin
of attraction of a given attractor, and thus regarded as an incomplete or noisy version of
the attractor, leads to a trajectory that converges to that attractor. This is a many-to-one
mapping or an associative search operation that leads to an associative memory attribute
of neural nets.

Table 1 Possible States of a Binary Neural Net of 3 Neurons

s_1   s_2   s_3
 0     0     0
 0     0     1
 0     1     0
 1     0     0
 0     1     1
 1     0     1
 1     1     0
 1     1     1

Local minima in an energy landscape or attractors in phase-space can be fixed by forming
$W_{ij}$ in accordance with the Hebbian learning rule (see both Hebb and Hopfield in list
of further reading), i.e., by taking the sum of the outer products of the bipolar versions of
the state vectors we wish to store in the net,

$$W_{ij} = \sum_{m=1}^{M} v_i^{(m)} v_j^{(m)} \tag{4}$$

where the $v_i^{(m)}$ (defined by Eq. (5), which is missing from the source text) are M
bipolar binary N-vectors we wish to store in the net. Provided that the $s_i^{(m)}$ are
uncorrelated and M satisfies the condition of Eq. (6) (also missing from the source text),
the M stored states $\mathbf{s}^{(m)}$ will become attractors in phase-space of the net,
or equivalently their associated energies will be local minima in the energy landscape of
the net, as illustrated conceptually in Fig. 2. As M increases beyond the value given by (6),
the memory is overloaded, spurious local minima are created in addition to the desired
ones, and the probability of correct recall from partial or noisy information deteriorates,
compromising operation of the net as an associative memory (see R. J. McEliece et al. in
list of further reading).

Fig. 2 Conceptual representation of energy landscape.
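The outer-product storage rule of Eq. (4) can be illustrated with a small sketch. It assumes that the "bipolar version" of a binary vector s is v = 2s - 1 (the source equation defining v is not available, so this is an assumption), uses zero thresholds and inputs, and stores three hand-picked 16-element patterns whose bipolar versions are mutually orthogonal. The helper names `block_pattern` and `recall` are illustrative only.

```python
import numpy as np


def block_pattern(n_blocks, n=16):
    """Binary (0/1) vector of length n made of alternating blocks of 1s and 0s."""
    block = n // n_blocks
    return np.array([1 if (i // block) % 2 == 0 else 0 for i in range(n)])


# Three binary vectors to store; their bipolar versions are mutually orthogonal.
stored = np.array([block_pattern(2), block_pattern(4), block_pattern(8)])

v = 2 * stored - 1            # assumed bipolar versions, entries in {-1, +1}
W = v.T @ v                   # sum of outer products, as in Eq. (4)
np.fill_diagonal(W, 0)        # W_ii = 0: neurons do not talk to themselves


def recall(s, max_iters=20):
    """Iterate s_i = U(sum_j W_ij s_j) until the binary state stops changing."""
    s = np.array(s, dtype=int)
    for _ in range(max_iters):
        new_s = (W @ s > 0).astype(int)
        if np.array_equal(new_s, s):
            break
        s = new_s
    return s


# Probe: the first stored vector with one element corrupted.
probe = stored[0].copy()
probe[5] = 1 - probe[5]

recovered = recall(probe)
print("Hamming distance of probe to stored vector:", int(np.sum(probe != stored[0])))
print("Hamming distance after recall:", int(np.sum(recovered != stored[0])))
```

The corrupted probe lies inside the basin of attraction of the first stored vector, so the iteration returns that vector. The patterns were chosen orthogonal to keep the outcome easy to verify by hand; randomly chosen vectors behave similarly only while M stays small relative to N.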

The net can also be formed in such a way as to lead to a hetero-associative storage and
recall function by setting the interconnection weights in accordance with

$$W_{ij} = \sum_{m=1}^{M} v_i^{(m)} g_j^{(m)} \tag{7}$$

where $v_i^{(m)}$ and $g_j^{(m)}$ are associated N-vectors. Networks of this variety can
be used as feedforward networks only, and this precludes the rich dynamics encountered
in feedback or recurrent networks from being observed. Nevertheless, they are useful for
simple mapping and representation.

Energy landscape considerations are useful in devising formulas for the storage of
sequences of associations or a cyclic sequence of associations as would be required for
conducting sequential or cyclic searches of memories.
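As a rough illustration of the hetero-associative prescription of Eq. (7), the sketch below stores two input-output pairs and recalls each output in a single feedforward pass. It works directly with bipolar vectors and a sign nonlinearity, which is a simplifying assumption rather than the article's binary formulation, and the example vectors are hypothetical.

```python
import numpy as np

# Input vectors g^(m): bipolar and mutually orthogonal (hypothetical examples).
g = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
              [1, 1, -1, -1, 1, 1, -1, -1]])

# Output vectors v^(m) associated with each input (hypothetical examples).
v = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
              [1, 1, 1, -1, -1, -1, 1, -1]])

# Eq. (7): W_ij = sum_m v_i^(m) g_j^(m), a sum of outer products.
W = v.T @ g


def associate(x):
    """Single feedforward pass: threshold the weighted sums at zero."""
    return np.sign(W @ x).astype(int)


for m in range(len(g)):
    print("input", m, "->", associate(g[m]))
```

Because the inputs are orthogonal, presenting $g^{(m)}$ yields weighted sums proportional to $v^{(m)}$, so the associated output is recovered exactly. There is no feedback loop here, which is why such a net shows none of the attractor dynamics of the recurrent case.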

Learning in biological neural nets is thought to occur by self-organization, where the
synaptic weights are modified electrochemically as a result of environmental (sensory and
other, e.g., contextual) inputs. All such learning requires plasticity, the process of gradual
synaptic modification. Adaptive learning algorithms can be deterministic or stochastic,
supervised or unsupervised. An optoelectronic Boltzmann machine and its learning
performance will be described in the section on large scale networks as an illustration of
the unique capabilities of optoelectronic hardware.

Neural  Nets  Classification  and  Useful 
Functions 

The energy function and energy landscape description of the behavior of neural networks
presented in the preceding sections allows their classification into three groups. For one
group the local minima in the energy landscape are what counts in the network's
operation. In the second group the local minima are not utilized and only the global
minimum is meaningful. In the third group the operations involved do not require energy
considerations; these nets are merely used for mapping and reduction of dimensionality.
The first group includes Hopfield-type nets for all types of associative memory
applications that include auto-associative, hetero-associative, sequential, and cyclic data
storage and recall. This category also includes all self-organizing and learning networks
regardless of whether the learning in them is supervised, unsupervised, deterministic, or
stochastic, since learning, whether hard or soft, can be interpreted as shaping the energy
landscape of the net so as to "dig" in it valleys corresponding to learned states of the
network. All nets in this category are capable of generalization. An input that was not
learned specifically but is within a prescribed Hamming distance† of one of the entities
learned would elicit, in the absence of any contradictory information, an output that is
close to the output evoked when the learned entity is applied to the net. Because of the
multilayered and partially interconnected nature of self-organizing networks, one can
define input and output groups of neurons that can be of unequal number (see section on
large scale networks). This is in contrast to Hopfield-type nets, which are fully
interconnected and in which the number of input and output neurons is therefore the same
(the same neurons define the initial and final states of the net). The ability to define input
and output groups of neurons in multilayered nets enables additional capabilities that
include learning, coding, mapping, and reduction of dimensionality.

The second group of neural nets includes nets that perform calculations that require
finding the global energy minimum of the net. The need for this type of calculation

†The Hamming distance between two binary N-vectors is the number of elements in which
they differ.

††A chaotic attractor is manifested by a phase-space trajectory that is completely
unpredictable and is highly sensitive to initial conditions. It could ultimately turn out to
play a role in cognition.


often occurs in combinatorial optimization problems and in the solution of inverse
problems encountered, for example, in vision, remote sensing, and control.

The third group of neural nets is multilayered with localized nonglobal connections
similar to those in cellular automata, where each neuron communicates within its layer
with a pattern of neurons in its neighborhood and with a pattern of neurons in the next
adjacent layer. Multilayered nets with such localized connections can be used for mapping
and feature extraction. Neural nets can also be categorized by whether they are single
layered or multilayered, self-organizing or non-self-organizing, solely feedforward or
involving feedback, stochastic or deterministic. However, the most general categorization
appears to be in terms of the way the energy landscape is utilized, or in terms of the kind
of attractors formed and utilized in its phase-space (limit points, limit cycles, or
chaotic††).


Implementations 

The earliest optoelectronic neurocomputer was of the fully interconnected variety, where
all neurons could talk to each other. It made use of incoherent light to avoid interference
effects and speckle noise and also to relax the stringent alignment required in coherent
light systems. An optical crossbar interconnect (see Fig. 3) was employed to carry out the
vector matrix multiplication operation required in the summation term in Eq. (2) (see
Farhat et al. (1985) in list of further reading). In this arrangement the state vector of the
net is represented by the linear light emitting array (LEA) or equivalently by a linear array
of light modulating elements of a spatial light modulator (SLM), the connectivity matrix
$W_{ij}$ is implemented in a photographic transparency mask (or a 2-D SLM when a
modifiable connectivity mask is needed for adaptive learning), and the activation
potential $u_i$ is measured with a photodiode array (PDA). Light from the LEA is smeared
vertically onto the $W_{ij}$ mask with the aid of an anamorphic lens system (cylindrical
and spherical lenses in tandem, not shown in the figure for simplicity). Light passing
through rows of $W_{ij}$ is focused onto the PDA elements by another anamorphic lens
system. To realize bipolar transmission values in incoherent light, positive elements and
negative elements of any row of $W_{ij}$ are assigned to two separate subrows of the
mask, and light passing through each subrow is focused onto adjacent pairs of photosites
of the PDA whose outputs are subtracted. In Fig. 3, both the neuron threshold $\theta_i$
and external input $I_i$ are injected optically with the aid of a pair of LEAs whose light is
focused on the PDA. Note that positive valued $I_i$ is assumed here and therefore its LEA
elements are shown positioned to focus onto positive photosites of the PDA only.

Fig. 4 Boltzmann learning machine: (a) optoelectronic circuit diagram of a net partitioned
into three layers by blocking segments of the interconnectivity mask, (b) hardware
implementation showing the state vector LED array at the top right, the MOSLM at the
center (between lenses), and an intensified PDA (PDA abutted to an image intensifier fiber
output window for added gain) in the lower left. The integrated circuit board rack contains
the MOSLM driver and computer interface, and the TV receiver in the background
provides the "snow pattern" that is imaged through a slit onto the intensifier input
window for optical injection of noise in the network.

This architecture was successfully employed in the first implementation of a 32 neuron net
(see Farhat et al. (1985) in list of further reading). Fig. 3 also shows a third LEA for
injection of spatio-temporal noise into the net as would be required, for example, in the
implementation of a noisy threshold scheme for the Boltzmann learning machine to be
discussed later. The net of Fig. 3 behaved as an associative memory very much as expected
and was found to exhibit correct recovery of three stored vectors from partial information
and showed robustness with element failure (two of its 32 neurons were accidentally
disabled, 2 PDA elements broke, and no noticeable degradation in performance was
observed).
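The arithmetic performed by the crossbar, including the use of two non-negative subrows per neuron to obtain bipolar weights with incoherent light, can be sketched numerically. The code is only a software analogy of the optical operation: the weight values are random placeholders, and the names `crossbar_activation`, `W_pos`, and `W_neg` are illustrative, not taken from the article.

```python
import numpy as np


def crossbar_activation(W, s, theta, I_ext):
    """Activation potentials computed the way the mask-plus-PDA scheme does."""
    W_pos = np.clip(W, 0.0, None)    # subrow carrying the positive weights
    W_neg = np.clip(-W, 0.0, None)   # subrow carrying magnitudes of negative weights
    # Each subrow only attenuates light, so both masks are non-negative.
    light_pos = W_pos @ s            # summed light on the "positive" photosites
    light_neg = W_neg @ s            # summed light on the "negative" photosites
    # Electronic subtraction of adjacent photosite outputs restores the sign.
    return light_pos - light_neg - theta + I_ext


rng = np.random.default_rng(0)
N = 8
W = rng.integers(-1, 2, size=(N, N)).astype(float)  # placeholder weights in {-1, 0, 1}
np.fill_diagonal(W, 0.0)
s = rng.integers(0, 2, size=N).astype(float)         # binary state vector (LEA on/off)
theta = np.zeros(N)
I_ext = np.zeros(N)

u = crossbar_activation(W, s, theta, I_ext)
print(u)
```

The result equals the direct evaluation of Eq. (2), which is the point of the two-subrow arrangement: intensities can only add, so the sign is recovered by subtracting the paired photosite outputs.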

In the arrangement of Fig. 3, the neurons are fully interconnected. To implement learning
in a neural net, one needs to impart structure to the net, i.e., be able to partition the net
into distinct input, output, and hidden groups or layers of neurons with a prescribed
pattern of communication or interconnections between them, which is not possible in a
fully interconnected or single layer network. A simple but effective way of partitioning a
fully interconnected optoelectronic net into several layers to form a partially
interconnected net is shown in Fig. 4(a). This is done simply by blocking certain portions
of the $W_{ij}$ matrix.

In the example shown, the blocked submatrices serve to prevent neurons from the input
group $V_1$ and the output group $V_2$ from talking to each other directly. They can do
so only via the hidden or buffer group of neurons H. Furthermore, neurons within H
cannot talk to each other. This partition scheme enables arbitrary division of neurons
among layers and can be rapidly set when a programmable nonvolatile SLM under
computer control is used to implement the connectivity weights. Neurons in the input and
output groups are called visible neurons because they interface with the environment.
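The blocking scheme can be pictured as a 0/1 mask applied to the weight matrix. The sketch below builds such a mask under two assumptions that are not stated in the text: the neurons are indexed in the order input group, hidden group, output group, and connections within each visible group are left open. The function name `partition_mask` and the group sizes are hypothetical.

```python
import numpy as np


def partition_mask(n_in, n_hidden, n_out):
    """Return a symmetric 0/1 mask; 0 marks a blocked interconnection."""
    n = n_in + n_hidden + n_out
    mask = np.ones((n, n), dtype=int)
    v1 = slice(0, n_in)                      # input group V1
    h = slice(n_in, n_in + n_hidden)         # hidden (buffer) group H
    v2 = slice(n_in + n_hidden, n)           # output group V2
    mask[v1, v2] = 0                         # V1 and V2 may not talk directly
    mask[v2, v1] = 0
    mask[h, h] = 0                           # neurons within H may not talk to each other
    np.fill_diagonal(mask, 0)                # no self-connections
    return mask


mask = partition_mask(n_in=3, n_hidden=2, n_out=3)
print(mask)

# Applying the mask to any weight matrix zeroes the blocked submatrices.
rng = np.random.default_rng(1)
W = rng.normal(size=mask.shape)
W = (W + W.T) / 2.0                          # keep the weights symmetric
W_partitioned = W * mask
```

Changing the three group sizes re-partitions the same physical net, which mirrors the remark that the division of neurons among layers is arbitrary and can be reset under computer control.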

The architecture of Fig. 4 can be used in supervised learning where, beginning from an
arbitrary $W_{ij}$, the net is presented through $V_1$ with an input vector from the
training set of vectors it is required to learn, and its convergent output state is observed on
$V_2$ and compared with the desired output (association) to produce an error signal,
which is used in turn according to a prescribed formula to update the weights matrix. This
process of error-driven adaptive weights modification is repeated a sufficient number of
times for each vector and all vectors of the training set until inputs evoke the correct
desired association at the output. At that time the net can be declared as having captured
the underlying structure of the environment (the vectors presented to it) by forming an
internal representation of the rules governing the mappings of inputs into the required
output associations.

Many error-driven learning algorithms have been proposed and studied. The most widely
used, the error backpropagation algorithm (see Werbos, Parker, and Rumelhart et al. in list
of further reading), is suited for use in feedforward multilayered nets that are void of
feedback between the neurons. The architecture of Fig. 4(a) has been successfully
employed in the initial demonstration of supervised stochastic learning by simulated
annealing. Our interest in stochastic learning stemmed from a desire to better understand
the possible role of noise in BNNs and to find means for accelerating the simulated
annealing process through the use of optics and optoelectronic hardware. For any
input-output association clamped on $V_1$ and $V_2$, and beginning from an arbitrary
$W_{ij}$ that could be random, the net is annealed through the hidden neurons by
subjecting them to optically injected noise in the form of a noise component added to the
threshold values of the neurons, as depicted by the noise input in Fig. 3.

IEEE CIRCUITS AND DEVICES MAGAZINE

The source of controlled noise used in this implementation
was realized by imaging a slice of the familiar "snow
pattern" displayed on an empty channel of a television
receiver, whose brightness could be varied under computer
control, onto the PD array of Fig. 4(a). This produces
controlled perturbation or "shaking" of the energy landscape
of the net which prevents its getting trapped into a state
of local energy minimum during iteration and guarantees
its reaching and staying in the state of the global energy
minimum or one close to it. This requires that the injected
noise intensity be reduced gradually, reaching zero when
the state of global energy minimum is reached to ensure
that the net will stay in that state. Gradual reduction of
noise intensity during this process is equivalent to reducing
the "temperature" of the net and is analogous to the
annealing of a crystal melt to arrive at a good crystalline
structure. It has accordingly been called simulated annealing by
early workers in the field.
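
The noise-driven relaxation described above can be sketched in
software. This is only a minimal illustration under stated
assumptions, not the optoelectronic implementation: neuron states
are taken as 0/1, Gaussian noise stands in for the television snow
pattern, the noise amplitude falls linearly to zero, and neurons
are updated one after another rather than in parallel. The names
noise_schedule and anneal_noisy_threshold are hypothetical.

```python
import random

def noise_schedule(start, steps):
    """Noise amplitudes falling linearly from `start` to exactly zero (steps >= 2)."""
    return [start * (steps - 1 - k) / (steps - 1) for k in range(steps)]

def anneal_noisy_threshold(W, theta, state, free, schedule, rng):
    """Relax the neurons listed in `free` while the threshold noise decays.

    Clamped neurons are simply left out of `free`. Gaussian noise is an
    assumption of this sketch, standing in for optically injected noise.
    """
    s = list(state)
    for amplitude in schedule:
        for i in free:
            # Activation potential of neuron i from all other neurons.
            u = sum(W[i][j] * s[j] for j in range(len(s)))
            noisy_theta = theta[i] + rng.gauss(0.0, amplitude)
            s[i] = 1 if u > noisy_theta else 0
    return s

# Two mutually excitatory neurons; the last sweep runs with zero noise.
W = [[0.0, 1.0], [1.0, 0.0]]
theta = [0.25, 0.25]
schedule = noise_schedule(2.0, 5)
final = anneal_noisy_threshold(W, theta, [0, 1], [0, 1], schedule, random.Random(0))
```

Because the final sweep is noise-free, the returned state is a
fixed point of the deterministic update, which mirrors the
requirement that the noise reach zero so the net stays where it
settles.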

Finding the global minimum of a "cost" or energy function
is a basic operation encountered in the solution of
optimization problems and is found not only in stochastic
learning. Mapping optimization problems into stochastic
nets of this type, combined with fast annealing to find the
state of global "cost function" minimum, could be a powerful
tool for their solution. The net behaves then as a stochastic
dynamical analog computer. In the case considered
here, however, optimization through simulated annealing
is utilized to obtain and list the convergent states at the
end of annealing bursts when the training set of vectors
(the desired associations) are clamped to V1 and V2. This
yields a table or listing of convergent state vectors from
which a probability P_ij of finding the i-th neuron and the
j-th neuron on at the same time is computed. This completes
the first phase of learning. The second phase of learning
involves clamping the V1 neurons only and annealing the
net through H and V2, obtaining thereby another list of
convergent state vectors at the end of annealing bursts and
calculating another probability P'_ij of finding the i-th and
j-th neurons on at the same time. The connectivity matrix,
implemented in a programmable magneto-optic SLM
(MOSLM), is modified then by ΔW_ij = ε(P_ij - P'_ij), computed
by the computer controller, where ε is a constant controlling
the learning rate. This completes one learning cycle or
episode. The above process is repeated again and again until
W_ij stabilizes and, one hopes, captures the underlying
structure of the training set. Many learning cycles are
required and the learning process can be time-consuming
unless the annealing process is sufficiently fast.
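
One learning cycle of the two-phase procedure can be written
compactly. The sketch below assumes the lists of convergent
state vectors from the two phases have already been collected,
with states coded as 0/1; the function names are hypothetical,
and forcing the diagonal of W to zero (no self-connections) is an
assumption not stated in the text.

```python
def cooccurrence(states):
    """P[i][j]: fraction of listed states in which neurons i and j are both on."""
    n = len(states[0])
    counts = [[0] * n for _ in range(n)]
    for s in states:
        for i in range(n):
            for j in range(n):
                if s[i] == 1 and s[j] == 1:
                    counts[i][j] += 1
    return [[c / len(states) for c in row] for row in counts]

def learning_cycle(W, clamped_states, free_states, eps):
    """Apply Delta W_ij = eps * (P_ij - P'_ij) once and return the new matrix."""
    P = cooccurrence(clamped_states)    # phase 1: inputs and outputs clamped
    P_prime = cooccurrence(free_states) # phase 2: inputs only clamped
    n = len(W)
    return [[0.0 if i == j else W[i][j] + eps * (P[i][j] - P_prime[i][j])
             for j in range(n)] for i in range(n)]

W0 = [[0.0] * 3 for _ in range(3)]
clamped = [[1, 1, 0], [1, 1, 0]]
free = [[1, 0, 0], [0, 1, 0]]
W1 = learning_cycle(W0, clamped, free, 0.5)
```

In this toy case neurons 0 and 1 are always on together in the
clamped phase and never in the free phase, so the weight between
them is strengthened while all other weights stay at zero.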

We  have  found  that  the  noisy  thresholding  scheme  leads 
the  net  to  anneal  and  find  the  global  energy  minimum  or 
one  close  to  it  in  about  35  time  constants  of  the  neurons 
used.  For  microsecond  neurons  this  could  be  10M0'  times 
faster than numerical simulation of stochastic learning by
simulated annealing which requires random selection of
neurons one at a time, switching their states, and accepting
the change of state in such a way that changes leading to
an energy decrease are accepted and those leading to
energy increases are allowed with a certain controlled
probability.
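
For comparison, the serial numerical procedure referred to above
can be sketched as follows. The text does not give the acceptance
probability; this sketch assumes the standard Metropolis
criterion exp(-dE/T), 0/1 states, a symmetric W with zero
diagonal, and the energy E = -(1/2) sum_ij W_ij s_i s_j +
sum_i theta_i s_i. The function names are hypothetical.

```python
import math
import random

def delta_energy(W, theta, s, i):
    """Energy change if neuron i flips its state."""
    u = sum(W[i][j] * s[j] for j in range(len(s)) if j != i)
    flip = 1 - 2 * s[i]  # +1 when switching on, -1 when switching off
    return -flip * (u - theta[i])

def metropolis_anneal(W, theta, state, temperatures, flips_per_temperature, rng):
    """Pick neurons at random, one at a time, and flip them with the Metropolis rule."""
    s = list(state)
    for T in temperatures:
        for _ in range(flips_per_temperature):
            i = rng.randrange(len(s))
            dE = delta_energy(W, theta, s, i)
            # Downhill moves are always accepted; uphill moves only with
            # probability exp(-dE / T), and never at zero temperature.
            if dE <= 0 or (T > 0 and rng.random() < math.exp(-dE / T)):
                s[i] = 1 - s[i]
    return s

W = [[0.0, 1.0], [1.0, 0.0]]
theta = [0.25, 0.25]
trapped = metropolis_anneal(W, theta, [0, 0], [0.0], 20, random.Random(1))
```

Started from [0, 0] at zero temperature, this small net never
leaves the local minimum, although [1, 1] has lower energy. That
is the trapping which a nonzero, gradually reduced temperature is
meant to avoid.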

The computer controller in Fig. 4 performs several functions.
It clamps the input/output neurons to the desired
states during the two phases of learning, controls the
annealing profile during annealing bursts, monitors the
convergent state vectors of the net, and computes and executes
the weights modification. For reasons related to the
thermodynamical and statistical mechanical interpretation of its
operation, the architecture in Fig. 4(a) is called a Boltzmann
learning machine. A pictorial view of an optoelectronic
(photonic) hardware implementation of a fully operational
Boltzmann learning machine is shown in Fig. 4(b). This
machine was built around a MOSLM as the adaptive weights
mask.

The interconnection matrix update during learning requires
small analog modifications ΔW_ij in W_ij. Pixel
transmittance in the MOSLM is binary, however. Therefore a
scheme for learning with binary weights was developed
and used in which W_ij is made 1 if (P_ij - P'_ij) > M regardless
of its preceding value, where M is a constant, is made
-1 if (P_ij - P'_ij) < -M regardless of its preceding value,
and is left unchanged if -M ≤ (P_ij - P'_ij) ≤ M. This
introduces inertia to weights modification and was found to
allow a net of N = 24 neurons partitioned into 8-8-8 groups
to learn two autoassociations with 95 percent score
(probability of correct recall) when the value of M was chosen
randomly between 0 and 0.5 for each learning cycle. This score
dropped to 70 percent in learning three autoassociations.
However, increasing the number of hidden neurons from
8 to 16 was found to yield perfect learning (100 percent
score).
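
The three-way rule for binary weights translates directly into
code. The sketch below assumes weights coded as +1/-1 and takes
the co-occurrence probabilities P and P' as plain nested lists;
binary_weight_update is a hypothetical name.

```python
def binary_weight_update(W, P, P_prime, M):
    """Set W_ij to +1 or -1 outside the dead zone [-M, M]; otherwise keep it."""
    n = len(W)
    new_W = [row[:] for row in W]
    for i in range(n):
        for j in range(n):
            d = P[i][j] - P_prime[i][j]
            if d > M:
                new_W[i][j] = 1
            elif d < -M:
                new_W[i][j] = -1
            # -M <= d <= M: previous value is retained (the "inertia")
    return new_W

W = [[0, -1], [1, 0]]
P = [[0.0, 0.9], [0.2, 0.0]]
P_prime = [[0.0, 0.1], [0.8, 0.0]]
updated = binary_weight_update(W, P, P_prime, 0.3)
unchanged = binary_weight_update(W, P, P_prime, 0.9)
```

Drawing M afresh for each learning cycle, as in the experiment
described, could be done with random.uniform(0.0, 0.5) from the
standard library.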

Scores were collected after 100 learning cycles by computing
probabilities of correct recall of the training set. Fast
annealing by the noisy thresholding scheme was found to
scale well with size of the net, establishing the viability of
constructing larger optoelectronic learning machines. In the
following section two schemes for realizing large-scale nets
are briefly described. One obvious approach discussed is
the clustering of neural modules or chips. This approach
requires that neurons in different modules be able to
communicate with each other in parallel, if fast simulated
annealing by noisy thresholding is to be carried out. This
requirement appears to limit the number of neurons per
module to the number of interconnects that can be made
from it to other modules. This is a thorny issue in VLSI
implementation of cascadeable neural chips (see Alspector
and Allen in list of further reading). It provides a strong
argument in favor of optoelectronic neural modules that
have no such limitation because communication between
modules is carried out by optical means and not by wire.


Large  Scale  Networks 

To date most optoelectronic implementations of neural
networks have been prototype units limited to a few tens or
hundreds of neurons. Use of neurocomputers in practical
applications involving fast learning or solution of optimization
problems requires larger nets. An important issue,
therefore, is how to construct larger nets with the
programmability and flexibility exhibited by the Boltzmann
learning machine prototype described. In this section we
present two possible approaches to forming large-scale nets
as examples demonstrating the viability of the photonic
approach. One is based on the concept of a clusterable
integrated optoelectronic neural chip or module that can
be optically interconnected to form a larger net, and the
second is an architecture in which a 2-D arrangement of
neurons is utilized, instead of the 1-D arrangement described
in earlier sections, in order to increase packing density and
to provide compatibility with 2-D sensory data formats.

Fig. 5 Optoelectronic neural net employing internal feedback and two
orthogonal nonlinear reflector arrays consisting of channels of
nonlinear light amplifiers (photodetectors, thresholding amplifiers, LEDs,
and LED drivers): (a) architecture, (b) detail of mask and single element
of nonlinear reflector array, (c) and (d) optoelectronic neural chip concept
and cluster of four chips, (e) neural chip for forming clusters of more than
four chips.

Clusterable  Photonic  Neural  Chips 

The concept of a clusterable photonic neural chip, which
is being patented by the University of Pennsylvania, is
arrived at by noting that when the connectivity matrix is
symmetrical, the architectures we described earlier (see Figs. 3
or 4(a)) can be modified to include internal optical feedback
and nonlinear "reflection" (optoelectronic detection,
amplification, thresholding and light emission or modulation)
on both sides of the connectivity mask W or nonvolatile
SLM (e.g., a MOSLM) as depicted in Fig. 5 (see Farhat
(1987) in list of further reading). The nonlinear reflector
arrays are basically retro-reflecting optoelectronic or
photonic light amplifier arrays that receive and retransmit light
on the same side facing the MOSLM.

Two further modifications are needed to arrive at the
concept of clusterable integrated optoelectronic or photonic
neural chips. One is replacement of the LEDs of the
nonlinear reflector arrays by suitable spatial light modulators,
of the fast ferroelectric liquid crystal variety for
example. The other is extending the elements of the nonlinear
reflector arrays to form stripes that extend beyond the
dimensions of the connectivity SLM, and sandwiching the latter
between two such striped nonlinear reflector arrays oriented
orthogonally to each other as depicted in Fig. 5(c). This
produces a photonic neural chip that operates in an ambient
light environment. Analog integrated circuit (IC)
technology would then be used to fabricate channels of
nonlinear (thresholding) amplifiers and SLM drivers, one
channel for each PD element. The minute IC chip thus
fabricated is mounted as an integral part on each PDA/SLM
assembly of the nonlinear reflector arrays. Individual
channels of the IC chip are bonded to the PDA and SLM
elements. Two such analog IC chips are needed per neural
chip. The size of the neural chip is determined by the
number of pixels in the SLM used.

An example of four such neural chips connected
optoelectronically to form a larger net by clustering is shown in
Fig. 5(d). This is achieved by simply aligning the ends of
the stripe PD elements in one chip with the ends of the
stripe SLM elements in the other. It is clear that the hybrid
photonic approach to forming the neural chip would
ultimately and preferably be replaced by an entirely integrated
photonic approach and that neural chips with the slightly
different form shown in Fig. 5(e) can be utilized to form
clusters of more than four. Large-scale neural nets produced
by clustering integrated photonic neural chips have
the advantage of enabling any partitioning arrangement,
of allowing neurons in the partitioned net to communicate
with each other in the desired fashion, enabling fast
annealing by noisy thresholding to be carried out, and of
being able to accept both optically injected signals (through
the PDAs) and electronically injected signals (through the
SLMs) in the nonlinear reflector arrays, facilitating
communication with the environment. Such nets are therefore
capable of both deterministic and stochastic learning.
Computer-controlled electronic partitioning and loading and
updating of the connectivity weights in the connectivity SLM
(which can be of the magneto-optic variety or the nonvolatile
ferroelectric liquid crystal (FeLCSLM) variety) is
assumed. This approach to realizing large-scale fully
programmable neural nets is currently being developed in
our laboratory, and illustrates the potential role integrated
photonics could play in the design and construction of a
new generation of analog computers intended for use in
neurocomputing and rapid simulation and study of
nonlinear dynamical systems.

Neural Nets with Two-Dimensional Deployment of Neurons

Neural net architectures in which neurons are arranged
in a two-dimensional (2-D) format to increase packing density
and to facilitate handling 2-D formatted data have
received early attention (see Farhat and Psaltis (1987) in list
of further reading). These arrangements involve a 2-D
N x N state "vector" or matrix s_ij representing the state of
neurons, and a four-dimensional (4-D) connectivity "matrix"
or tensor T_ijkl representing the weights of synapses
between neurons. A scheme for partitioning the 4-D
connectivity tensor into an N x N array of submatrices, each
of which has N x N elements, to enable storing it in a flat
2-D photomask or SLM for use in optoelectronic
implementation has been developed (see Farhat and Psaltis (1987)
in list of further reading). Several arrangements are
possible using this partitioning scheme (see Fig. 6).

Fig. 6 Three optoelectronic network architectures in which the neurons
are arranged in two-dimensional format employing (a) parallel nonlinear
electronic amplification and feedback, (b) serial nonlinear electronic
amplification and feedback, (c) parallel nonlinear electron-optical
amplification and feedback.

In Fig. 6(a), neuron states are represented with a 2-D LED
array (or equivalently with a 2-D SLM). A two-dimensional
lenslet array is used to spatially multiplex and project the
state vector display onto each of the submatrices of the
partitioned connectivity mask. The product of the state
matrix with each of the weights stored in each submatrix is
formed with the help of a spatially integrating square
photodetector of suitable size positioned behind each
submatrix. The (i,j)th photodetector output represents the activation
potential u_ij of the (i,j)th neuron. These activation
potentials are nonlinearly amplified and fed back in parallel to
drive the corresponding elements of the LED state array or
those of the state SLM. In this fashion, weighted
interconnections between all neurons are established by means of
the lenslet array instead of the optical crossbar arrangement
used to establish connectivity between neurons when they
are deployed on a line.
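
The partitioned mask and the role of the integrating photodetectors
can be illustrated numerically. In the sketch below the placement
of submatrix (i, j) at block row i and block column j of the flat
mask is an assumption made for illustration, as are the function
names; the sum over each block plays the part of the spatially
integrating photodetector, giving u_ij = sum_kl T_ijkl s_kl.

```python
def flatten_tensor(T, N):
    """Lay T[i][j][k][l] out as an N x N array of N x N submatrices."""
    mask = [[0] * (N * N) for _ in range(N * N)]
    for i in range(N):
        for j in range(N):
            for k in range(N):
                for l in range(N):
                    mask[i * N + k][j * N + l] = T[i][j][k][l]
    return mask

def activation_potentials(mask, s, N):
    """Overlay the state matrix s on every submatrix and sum each block."""
    u = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            u[i][j] = sum(mask[i * N + k][j * N + l] * s[k][l]
                          for k in range(N) for l in range(N))
    return u

N = 2
# Toy tensor whose entries encode their own indices, to make the layout visible.
T = [[[[1000 * i + 100 * j + 10 * k + l for l in range(N)] for k in range(N)]
      for j in range(N)] for i in range(N)]
mask = flatten_tensor(T, N)
u = activation_potentials(mask, [[1, 0], [0, 1]], N)
```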

Both plastic molded and glass micro-lenslet arrays can
be fabricated today in 2-D formats. Glass micro-lenslet
arrays with density of 9 to 25 lenslets/mm² can be made in
large areas using basically photolithographic techniques.
Resolution of up to ~50 lp/mm can also be achieved.
Therefore, a micro-lenslet array of (100 x 100) mm², for
example, containing easily 10^5 lenslets could be used to form
a net of 10^5 neurons provided that the required nonlinear
light amplifiers (photodetector/thresholding amplifier/LED
or SLM driver array) become available. This is another
instance where integrated optoelectronics technology can play
a central role. We have built an 8 x 8 neuron version of the
arrangement in Fig. 6(a) employing a square LED array, a
square plastic lenslet array, and a square PDA, each of
which has 8 x 8 elements, in which the state update was
computed serially by a computer which sampled the
activation potentials provided by the PDA and furnished the
drive signals to the LED array. The connectivity weights in
this arrangement were stored in a photographic mask which
was formed with the help of the system itself in the following
manner: Starting from a set of unipolar binary matrices
b_ij to be stored in the net, the required 4-D connectivity
tensor was obtained by computing the sum of the outer
products of the bipolar binary versions v_ij = 2b_ij - 1. The
resulting connectivity tensor was partitioned and unipolar
binary quantized versions of its submatrices were displayed
in order by the computer on the LED display and stored
at their appropriate locations in a photographic plate placed
in the image plane of the lenslet array by blocking all
elements of the lenslet array except the one where a particular
submatrix was to be stored. This process was automated
with the aid of a computer-controlled positioner scanning
a pinhole mask in front of the lenslet array so that the
photographic plate is exposed to each submatrix of the
connectivity tensor displayed sequentially by the computer.
The photographic plate was then developed and positioned
back in place. Although time-consuming, this method of
loading the connectivity matrix in the net has the advantage
of compensating for all distortions and aberrations of the
system.
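
The numerical part of this storage procedure, forming the
connectivity tensor as a sum of outer products and quantizing it,
can be sketched as follows. The text does not say how entries
equal to zero were quantized; mapping positive entries to 1 and
all others to 0 is an assumption of this sketch, and the function
names are hypothetical.

```python
def bipolar(b):
    """Convert a unipolar binary matrix (0/1) to bipolar form v = 2b - 1."""
    return [[2 * x - 1 for x in row] for row in b]

def outer_product_tensor(patterns):
    """T[i][j][k][l] = sum over stored patterns of v[i][j] * v[k][l]."""
    N = len(patterns[0])
    T = [[[[0] * N for _ in range(N)] for _ in range(N)] for _ in range(N)]
    for b in patterns:
        v = bipolar(b)
        for i in range(N):
            for j in range(N):
                for k in range(N):
                    for l in range(N):
                        T[i][j][k][l] += v[i][j] * v[k][l]
    return T

def quantize_unipolar(T):
    """Unipolar binary version of T: 1 where T > 0, else 0 (assumed rule)."""
    return [[[[1 if t > 0 else 0 for t in row] for row in sub]
             for sub in block] for block in T]

patterns = [[[1, 0], [0, 1]], [[1, 1], [0, 0]]]
T = outer_product_tensor(patterns)
W = quantize_unipolar(T)
```

Each W[i][j] is one N x N submatrix of the kind that would be
displayed in turn and exposed onto its own location of the
photographic plate.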

The procedure for loading the memory in the system can
be speeded up considerably by using an array of minute
electronically controlled optical shutters (switches) to
replace the function of the mechanically scanned pinhole.
The shutter array is placed just in front of or behind the lenslet
array such that each element of the lenslet array has a
corresponding shutter element in register with it. An
electronically addressed ferroelectric liquid crystal spatial light
modulator (FeLCSLM) (see Spatial Light Modulators and
Applications in list of further reading) is a suitable
candidate for this task because of its fast switching speed (a few
microseconds). Development of FeLCSLMs is being
pursued worldwide because of their speed, high contrast, and
bistability which enables nonvolatile switching of pixel
transmission between two states. These features make
FeLCSLMs also attractive for use as programmable
connectivity masks in learning networks such as the
Boltzmann machine in place of the MOSLM presently in use.

Because the connectivity matrix was unipolar, an adaptive
threshold equal to the mean or energy of the iterated
state vector was found to be required in computing the
updated state to make the network function as an associative
memory that performed in accordance with theoretical
predictions of storage capacity and for successful associative
search when sketchy (noisy and/or partial) inputs are
presented. Recent evidence in our work is showing that logistic
neurons, mentioned in a footnote earlier, allow using
unipolar connectivity weights in a network without having to
resort to adaptive thresholding. This behavior may be caused
by the possibility that logistic neurons, with their "humped"
nonsigmoidal response, combine at once features of
excitatory and inhibitory neurons which, from all presently
available evidence, is biologically not plausible. Biological
plausibility, it can be argued, is desirable for guiding
hardware implementations of neural nets but is not absolutely
necessary as long as departures from it facilitate and
simplify implementations without sacrificing function and
flexibility.
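
Why a state-dependent threshold is needed with unipolar weights
can be seen from a short calculation. If W = (C + 1)/2 is the
unipolar version of a clipped bipolar matrix C with entries +1 or
-1, then sum_kl W_ijkl s_kl = (1/2) sum_kl C_ijkl s_kl +
(1/2) sum_kl s_kl, so comparing the unipolar sum with half the
number of "on" neurons is equivalent to a zero threshold with
bipolar weights. The sketch below uses that factor of one half,
which is an assumption of this illustration; the text says only
that the threshold equals the mean or energy of the state vector.
The function names are hypothetical.

```python
def clipped_unipolar_weights(b):
    """Unipolar binary weights (0/1) for a single stored unipolar pattern b."""
    N = len(b)
    v = [[2 * x - 1 for x in row] for row in b]
    return [[[[1 if v[i][j] * v[k][l] > 0 else 0 for l in range(N)]
              for k in range(N)] for j in range(N)] for i in range(N)]

def recall_step(W, s):
    """One parallel update of all neurons using an adaptive threshold."""
    N = len(s)
    # Threshold tracks the current state: half the number of "on" neurons.
    theta = 0.5 * sum(sum(row) for row in s)
    return [[1 if sum(W[i][j][k][l] * s[k][l]
                      for k in range(N) for l in range(N)) > theta else 0
             for j in range(N)] for i in range(N)]

stored = [[1, 0], [0, 1]]
W = clipped_unipolar_weights(stored)
recalled_from_full = recall_step(W, stored)
recalled_from_partial = recall_step(W, [[1, 0], [0, 0]])
```

In this one-pattern toy case the stored matrix is a fixed point
of the update, and a partial input with a single "on" element is
completed to the stored pattern in one step.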

Several variations of the above basic 2-D architecture were
studied. One, shown in Fig. 6(b), employs an array of light
integrating elements (lenslet array plus diffusers, for
example) and a CCD camera plus serial nonlinear amplification
and driving to display the updated state matrix on a
display monitor. In Fig. 6(c) a microchannel spatial light
modulator (MCSLM) is employed as an electron-optical array
of thresholding amplifiers and to simultaneously display
the updated state vector in coherent laser light as input
to the system. The spatial coherence of the state vector
display in this case also enables replacing the lenslet array
with a fine 2-D grating to spatially multiplex the displayed
image onto the connectivity photomask. Our studies show
that the 2-D architectures described are well suited to
implementing large networks with semi-global or local rather
than global interconnects between neurons, with each neuron
capable of communicating with up to a few thousand
neurons in its vicinity depending on lenslet resolution and
geometry. Adaptive learning in these architectures is also
possible provided a suitable erasable storage medium is
found to replace the photographic mask. For example, in
yet another conceivable variant of the above architectures,
the lenslet array can be used to spatially demultiplex the
connectivity submatrices presented in a suitable 2-D erasable
display, i.e. project them in perfect register, onto a
single SLM device containing the state vector data. This
enables forming the activation potential array u_ij directly
and facilitates carrying out the required neuron response
operations (nonlinear gain) optically and in parallel through
appropriate choice of the state vector SLM and the
architecture. Variations employing internal feedback, as in 1-D
neural nets, can also be conceived.

Discussion 

Optoelectronics (or photonics) offers clear advantages for
the design and construction of a new generation of analog
computers (neurocomputers) capable of performing
computational tasks collectively and dynamically at very high
speed. As such, they are suited for use in the solution of
complex problems encountered in cognition, optimization,
and control that have defied efficient handling with
traditional digital computation even when very powerful digital
computers are used. The architectures and proof-of-concept
prototypes described are aimed at demonstrating that the
optoelectronic approach can combine the best attributes of
optics and electronics together with programmable
nonvolatile spatial light modulators and displays to form
versatile neural nets with important capabilities that include
associative storage and recall, self-organization and adaptive
learning (self-programming), and fast solution of
optimization problems. Large-scale versions of these
neurocomputers are needed for tackling real world problems.
Ultimately these can be realized using integrated
optoelectronic (integrated photonic) technology rather than
the hybrid optoelectronic approach presented here. Thus,
new impetus is added for the development of integrated
optoelectronics besides that coming from the needs of high
speed optical communication. One can expect variations of
integrated optoelectronic repeater chips utilizing GaAs on
silicon technology, which are being developed with optical
communication in mind (see J. Shibata and T. Kajiwara in list of
further reading). These, when fabricated in dense array
form, will find widespread use in the construction of
large-scale analog neurocomputers. This class of neurocomputers
will probably also find use in the study and fast simulation
of nonlinear dynamical systems and chaos and its role in a
variety of systems.

Biological neural nets were evolved in nature for one
ultimate purpose: that of maintaining and enhancing
survivability of the organism they reside in. Embedding
artificial neural nets in man-made systems, and in particular
autonomous systems, can serve to enhance their
survivability and therefore reliability. Survivability is also a central
issue in a variety of systems with complex behavior
encountered in biology, economics, social studies, and
military science. One can therefore expect neuromorphic
processing and neurocomputers to play an important role
in the modeling and study of such complex systems,
especially if integrated optoelectronic techniques can be made
to extend the flexibility and speed demonstrated in the
prototype nets described to large-scale networks. One should
also expect that software development for emulating neural
functions on serial and parallel digital machines will not
continue to be confined, as at present, to the realm of
straightforward simulation but, spurred by the mounting
interest in neural processing, will move into the algorithmic
domain where fast efficient algorithms are likely to be
developed, especially for parallel machines, becoming to neural
processing what the FFT (fast Fourier transform) was to the
discrete Fourier transform. Thus we expect that advances
in neuromorphic analog and digital signal processing will
proceed in parallel and that applications would draw on
both equally.